Re: Patch: fix lock contention for HASHHDR.mutex - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: Patch: fix lock contention for HASHHDR.mutex
Date
Msg-id 56A8C3EF.7030604@postgrespro.ru
In response to Re: Patch: fix lock contention for HASHHDR.mutex  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
Responses Re: Patch: fix lock contention for HASHHDR.mutex  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
List pgsql-hackers
22.01.2016 13:48, Aleksander Alekseev:
>> Then, this thread became too tangled. I think it is worth writing a
>> new message with the patch, the test script, some results, and a
>> brief overview of how it really works. That will make further review
>> much easier.
> Sure.
>
> HASHHDR represents a hash table, which can be regular or partitioned.
> A partitioned table is stored in shared memory and accessed by
> multiple processes simultaneously. To prevent data corruption, the
> hash table is partitioned, and each process has to acquire a lock on
> the corresponding partition before accessing the data in it. The
> partition number is determined by the lower bits of the key's hash
> value. The trickiest part is that dynahash knows nothing about these
> locks; all locking is done on the calling side.
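The caller-side locking described above can be sketched roughly as follows. This is an illustrative sketch with hypothetical names (`PARTITION_FOR`, `PartitionLock`, and so on), not the actual dynahash or PostgreSQL code:

```c
#include <assert.h>
#include <stdint.h>

/* Number of partitions; a power of two, so the low bits of the hash
 * select a partition with a simple mask. */
#define NUM_PARTITIONS 16
#define PARTITION_FOR(hash) ((hash) & (NUM_PARTITIONS - 1))

/* Stand-in for a real lock type (e.g. an LWLock in shared memory). */
typedef struct { int locked; } PartitionLock;

static PartitionLock partition_locks[NUM_PARTITIONS];

static void acquire(PartitionLock *l) { l->locked = 1; }
static void release(PartitionLock *l) { l->locked = 0; }

/* Caller-side locking: the hash table itself knows nothing about these
 * locks, so the caller picks the partition lock from the key's hash
 * value before touching the table. */
static PartitionLock *lock_for_key(uint32_t hashvalue)
{
    PartitionLock *lock = &partition_locks[PARTITION_FOR(hashvalue)];
    acquire(lock);
    /* ... the actual hash lookup/insert would happen here ... */
    return lock;                /* caller calls release() when done */
}
```

Because `NUM_PARTITIONS` is a power of two, the mask is equivalent to taking the hash modulo the partition count, which is the "lower bits" selection mentioned above.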

> Since shared memory is pre-allocated and can't grow, we use a free
> list (freeList) to allocate memory within the hash table. We also use
> the nentries field to count the current number of elements in the
> table. Since the hash table is used by multiple processes, these
> fields are protected by a spinlock, which, as it turned out, can
> cause lock contention and become a bottleneck.
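The contended pattern described here can be sketched like this (hypothetical types; the real dynahash structures differ): a single spinlock guards both the shared free list and the element counter, so every allocation from any backend serializes on that one lock.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Element { struct Element *next; } Element;

typedef struct
{
    int      mutex;      /* stand-in for a real spinlock            */
    long     nentries;   /* current number of elements in the table */
    Element *freeList;   /* pre-allocated elements; memory can't grow */
} HashHeader;

static void spin_lock(int *m)   { *m = 1; }
static void spin_unlock(int *m) { *m = 0; }

/* Take one element off the shared free list; returns NULL when the
 * pre-allocated memory is exhausted. Every caller contends on the
 * same mutex, which is the bottleneck the patch addresses. */
static Element *alloc_element(HashHeader *hctl)
{
    Element *e;

    spin_lock(&hctl->mutex);
    e = hctl->freeList;
    if (e != NULL)
    {
        hctl->freeList = e->next;
        hctl->nentries++;
    }
    spin_unlock(&hctl->mutex);
    return e;
}
```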

> After trying a few approaches, I found that using one spinlock per
> partition works quite well. Here are the latest benchmark results:
>
> http://www.postgresql.org/message-id/20151229184851.1bb7d1bd@fujitsu
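The per-partition idea can be sketched as follows. This is a hypothetical layout, not the patch itself: each partition gets its own spinlock, counter, and free list, so backends working on different partitions no longer contend on a single lock.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PARTITIONS 16    /* hypothetical; a power of two */

/* One spinlock, counter, and free list per partition. */
typedef struct
{
    int   mutex;                /* per-partition spinlock stand-in */
    long  nentries;
    void *freeList;
} FreeListSketch;

typedef struct
{
    FreeListSketch parts[SKETCH_PARTITIONS];
} PartitionedHeaderSketch;

/* Backends touching different partitions get different locks, so
 * contention only arises within a partition. */
static FreeListSketch *freelist_for(PartitionedHeaderSketch *hdr,
                                    uint32_t hashvalue)
{
    return &hdr->parts[hashvalue & (SKETCH_PARTITIONS - 1)];
}
```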

> Note that the "no locks" solution can't be used, because in some
> corner cases it doesn't guarantee that all available memory will be
> used.
>
> You can find a few more details and a test script in the first
> message of this thread. If you have any other questions regarding
> this patch, please don't hesitate to ask.
InitLocks
    /*
     * Compute init/max size to request for lock hashtables.  Note these
     * calculations must agree with LockShmemSize!
     */

This comment certainly requires some changes.
BTW, could you explain why init_table_size is two times less than
max_table_size?
Apart from that, I don't see any problems with your changes.

-    hctl->nentries = 0;
-    hctl->freeList = NULL;

Why did you delete these two lines? I wonder whether you should
rewrite them instead.

+ * This particular number of partitions significantly reduces lock contention
+ * in partitioned hash tables, almost as if partitioned tables didn't use any
+ * locking at all

As far as I understand, this number was obtained experimentally? If
so, maybe you should note that in the comment.

And last, but not least.
+        nelem_alloc = ((int) nelem) / partitions_number;
+        if (nelem_alloc == 0)
+            nelem_alloc = 1;
+
+        for (i = 0; i < partitions_number; i++)
+            if (!element_alloc(hashp, nelem_alloc, i))
+                ereport(ERROR,
+                        (errcode(ERRCODE_OUT_OF_MEMORY),
+                         errmsg("out of memory")));

It looks like you forgot to round the result of the division up.
For example, suppose nelem=25 and partitions_number=6.
25 / 6 = 4, so you allocate only 24 elements, while 1 is lost.
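One way to avoid losing the remainder is to round the division up. A minimal sketch (the names mirror the excerpt above, but this helper is illustrative, not part of the patch):

```c
#include <assert.h>

/* Ceiling division: every one of nelem elements is covered even when
 * nelem is not a multiple of partitions_number. */
static int nelem_alloc_per_partition(int nelem, int partitions_number)
{
    int nelem_alloc = (nelem + partitions_number - 1) / partitions_number;

    return (nelem_alloc == 0) ? 1 : nelem_alloc;
}
```

With nelem=25 and partitions_number=6 this yields 5, so 6 partitions cover all 25 elements (at the cost of slightly over-allocating to 30) instead of losing one at 4 * 6 = 24.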
As for performance, I ran a test on my notebook. The patched version
shows a small performance improvement.

pgbench -j 1 -c 1 -f pgbench.sql -T 300 postgres
base    patched
127tps  145tps

pgbench -j 8 -c 8 -f pgbench.sql -T 300 postgres
base    patched
247tps  270tps

But I don't have any big machine to run tests on, where the patch is
supposed to make a significant difference, so I will rely on your and
the other reviewers' results.

Apart from the mentioned
notes, I suppose the patch is good enough to pass it to a reviewer.

-- 
 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
