Re: Move PinBuffer and UnpinBuffer to atomics - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: Move PinBuffer and UnpinBuffer to atomics
Date:
Msg-id: CAA4eK1LzZmaQ7Ky2HSPH0=7a6FowOkxuU4w7DE0JfCL2LzxZYw@mail.gmail.com
In response to: Re: Move PinBuffer and UnpinBuffer to atomics (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List: pgsql-hackers
On Mon, Apr 11, 2016 at 7:33 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Sun, Apr 10, 2016 at 2:24 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
I also tried to run perf top during pgbench and got some interesting results.

Without 5364b357:
   5,69%  postgres                 [.] GetSnapshotData
   4,47%  postgres                 [.] LWLockAttemptLock
   3,81%  postgres                 [.] _bt_compare
   3,42%  postgres                 [.] hash_search_with_hash_value
   3,08%  postgres                 [.] LWLockRelease
   2,49%  postgres                 [.] PinBuffer.isra.3
   1,58%  postgres                 [.] AllocSetAlloc
   1,17%  [kernel]                 [k] __schedule
   1,15%  postgres                 [.] PostgresMain
   1,13%  libc-2.17.so             [.] vfprintf
   1,01%  libc-2.17.so             [.] __memcpy_ssse3_back

With 5364b357:
  18,54%  postgres                 [.] GetSnapshotData
   3,45%  postgres                 [.] LWLockRelease
   3,27%  postgres                 [.] LWLockAttemptLock
   3,21%  postgres                 [.] _bt_compare
   2,93%  postgres                 [.] hash_search_with_hash_value
   2,00%  postgres                 [.] PinBuffer.isra.3
   1,32%  postgres                 [.] AllocSetAlloc
   1,10%  libc-2.17.so             [.] vfprintf

Very surprising.  It appears that after 5364b357, GetSnapshotData consumes more time.  But I can't see anything in the GetSnapshotData code that depends on clog buffers...

There is a related observation presented by Mithun C Y as well [1], which suggests that Andres's idea of reducing the cost of snapshots shows a noticeable gain after increasing the clog buffers.  If you read that thread, you will notice that initially we didn't see much gain from that idea, but with increased clog buffers it started showing a noticeable gain.  If possible, could you apply that patch and see the results (the latest patch is at [2])?
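
For reference, as far as I remember, commit 5364b357 just raises the cap on the number of clog buffers computed in CLOGShmemBuffers() (src/backend/access/transam/clog.c); a rough sketch from memory, not the verbatim diff:

    Size
    CLOGShmemBuffers(void)
    {
        /* before 5364b357 (from memory): Min(32, Max(4, NBuffers / 512)) */
        return Min(128, Max(4, NBuffers / 512));
    }

The point is only that a larger clog SLRU enlarges that part of shared memory, which shifts the placement (and alignment) of everything allocated after it, even for a read-only workload that barely touches the clog.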



I took a look at this thread, but I still don't understand why the number of clog buffers affects a read-only benchmark.
Could you please explain it to me in more detail?



As Andres already pointed out, this is mainly due to shared memory alignment issues.  We have observed that changing the arrangement of shared memory structures sometimes causes huge differences in performance.  I guess that is why, with the cache-the-snapshot patch, I see the performance restored (mainly because it changes the shared memory structures).  I think the right way to fix this is to find which shared structure(s) need padding, so that we don't see such fluctuations every time we change something in shared memory.
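
To illustrate the kind of padding I have in mind (just a rough sketch with a hypothetical struct name, along the lines of what buf_internals.h does for buffer descriptors):

    /*
     * Illustrative sketch only.  Pad each shared-memory entry to a full
     * cache line so that unrelated changes elsewhere in shared memory do
     * not alter its alignment, and so that adjacent entries do not share
     * a cache line (false sharing).  PG_CACHE_LINE_SIZE comes from
     * pg_config_manual.h; SomeSharedStruct is a hypothetical placeholder
     * and must fit within PG_CACHE_LINE_SIZE bytes.
     */
    typedef union SomeSharedStructPadded
    {
        SomeSharedStruct data;
        char        pad[PG_CACHE_LINE_SIZE];
    } SomeSharedStructPadded;

The corresponding shared array would then be allocated as padded entries aligned on a PG_CACHE_LINE_SIZE boundary, similar to how the buffer descriptors array is handled now.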



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
