Re: Patch: fix lock contention for HASHHDR.mutex - Mailing list pgsql-hackers
From | Aleksander Alekseev |
---|---|
Subject | Re: Patch: fix lock contention for HASHHDR.mutex |
Date | |
Msg-id | 20160301174313.218e04c7@fujitsu Whole thread Raw |
In response to | Re: Patch: fix lock contention for HASHHDR.mutex (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Patch: fix lock contention for HASHHDR.mutex
Re: Patch: fix lock contention for HASHHDR.mutex |
List | pgsql-hackers |
Hello, Amit > I am not sure, if this is exactly what has been suggested by Robert, > so it is not straightforward to see if his suggestion can allow us to > use NUM_FREELISTS as 8 rather than 32. I think instead of trying to > use FREELISTBUFF, why not do it as Robert has suggested and try with > different values of NUM_FREELISTS? Well, what I did is in fact a bit more general solution then Robert suggested. At first I just joined freeList and mutex arrays into one array and tried different NUM_FREELISTS values (see FREELISTS column). That was Robert's idea. Unfortunately it didn't give any considerable speedup comparing with previous approach. Then I tried to manually control sizeof every array item (see different SIZE values). By making it larger we can put every array item into a separate cache line. This approach helped a bit in terms of TPS (before: +33.9%, after: +34.2%) but only if NUM_FREELISTS remains the same (32). So answering your question - it turned out that we _can't_ reduce NUM_FREELISTS this way. Also I don't believe that +0.3% TPS according to synthetic benchmark that doesn't represent any possible workload worth it considering additional code complexity. > In which case, do you think entries can go negative? I think the > nentries is incremented and decremented in the way as without patch, > so I am not getting what can make it go negative. In this context we are discussing a quick and dirty fix "what would happen if FREELIST_IDX would be implemented as getpid() % NUM_FREE_LIST instead of hash % NUM_FREELIST". The code is: int freelist_idx = FREELIST_IDX(hctl, hashvalue); // ... switch (action) { // ... case HASH_REMOVE: if (currBucket != NULL) { if (IS_PARTITIONED(hctl)) SpinLockAcquire(&(FREELIST(hctl, freelist_idx).mutex)); // Assert(FREELIST(hctl, freelist_idx).nentries > 0); FREELIST(hctl, freelist_idx).nentries--; /* remove record from hash bucket's chain. */ *prevBucketPtr = currBucket->link; // ... No matter what freelist was used when element was added to a hash table. We always try to return free memory to the same freelist number getpid() % FREELIST_ITEMS and decrease number of elements in a hash table using corresponding nentries field. For this reason nentries could be negative. Still sum of all nentries remains valid. I realize that this thread is quite long already and that its been a while since previous commitfest ended. So I would like to provide a short summery of this thread from my perspective: 1. Last version of a path is here: http://www.postgresql.org/message-id/CA+Tgmobtf9nH566_jjs=jrTyMq5HdQdaRF5j7o+AbdOWQHE_Ow@mail.gmail.com Its reviewed, tested, well-commented and apparently nobody has strong objections against it. 2. Robert suggested a few possible improvements. Experiments showed that in best case resulting patches are as good as path from item 1, but code become considerably more complex. 3. Number of further changes (changing calling convention for ShmemInitHash(), changing LOG2_NUM_LOCK_PARTITIONS constant) will be discussed separately when and if path from this thread will be applied. > > > > > > > > I am however wondering if it to set the freelist affinity based on > > > something other than the hash value, like say the PID, so that the > > > same process doesn't keep switching to a different freelist for > > > every buffer eviction. > > > > Also I tried to use PID to determine freeList number: > > > > ``` > > #include <sys/types.h> > > #include <unistd.h> > > > > ... > > > > #define FREELIST_IDX(hctl, hashcode) \ > > (IS_PARTITIONED(hctl) ? ((uint32)getpid()) % NUM_FREELISTS : 0) > > > > > ... > > > > // now nentries could be negative in this case > > // Assert(FREELIST(hctl, freelist_idx).nentries > 0); > > > > > In which case, do you think entries can go negative? I think the > nentries is incremented and decremented in the way as without patch, > so I am not getting what can make it go negative. > > > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: