RE: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From k.jamison@fujitsu.com
Subject RE: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id OSBPR01MB3207CB149B4DC31B7AC1F062EF770@OSBPR01MB3207.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [Patch] Optimize dropping of relation buffers using dlist  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [Patch] Optimize dropping of relation buffers using dlist
List pgsql-hackers
On Thurs, November 7, 2019 1:27 AM (GMT+9), Robert Haas wrote:
> On Tue, Nov 5, 2019 at 10:34 AM Tomas Vondra <tomas.vondra@2ndquadrant.com>
> wrote:
> > 2) This adds another hashtable maintenance to BufferAlloc etc. but
> >     you've only done tests / benchmark for the case this optimizes. I
> >     think we need to see a benchmark for workload that allocates and
> >     invalidates lot of buffers. A pgbench with a workload that fits into
> >     RAM but not into shared buffers would be interesting.
> 
> Yeah, it seems pretty hard to believe that this won't be bad for some workloads.
> Not only do you have the overhead of the hash table operations, but you also
> have locking overhead around that. A whole new set of LWLocks where you have
> to take and release one of them every time you allocate or invalidate a buffer
> seems likely to cause a pretty substantial contention problem.

I'm sorry for the late reply. Thank you Tomas and Robert for checking this patch.
Attached is the v3 of the patch.
- I moved the unnecessary items from buf_internals.h to cached_buf.c since most of
  of those items are only used in that file.
- Fixed the bug of v2. Seems to pass both RT and TAP test now

Thanks for the advice on benchmark test. Please refer below for test and results.

[Machine spec]
CPU: 16, Number of cores per socket: 8
RHEL6.5, Memory: 240GB

scale: 3125 (about 46GB DB size)
shared_buffers = 8GB

[workload that fits into RAM but not into shared buffers]
pgbench -i -s 3125 cachetest
pgbench -c 16 -j 8 -T 600 cachetest

[Patched]
scaling factor: 3125
query mode: simple
number of clients: 16
number of threads: 8
duration: 600 s
number of transactions actually processed: 8815123
latency average = 1.089 ms
tps = 14691.436343 (including connections establishing)
tps = 14691.482714 (excluding connections establishing)

[Master/Unpatched]
...
number of transactions actually processed: 8852327
latency average = 1.084 ms
tps = 14753.814648 (including connections establishing)
tps = 14753.861589 (excluding connections establishing)


My patch caused a little overhead of about 0.42-0.46%, which I think is small.
Kindly let me know your opinions/comments about the patch or tests, etc.

Thanks,
Kirk Jamison

Attachment

pgsql-hackers by date:

Previous
From: Alexey Kondratov
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Next
From: Antonin Houska
Date:
Subject: Re: Attempt to consolidate reading of XLOG page