Re: Scaling shared buffer eviction - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Scaling shared buffer eviction
Msg-id CAA4eK1+5A1+_N+RLBOBzJFfdT4jvaDqJoWQ9cyF=no_4yLO5og@mail.gmail.com
In response to Re: Scaling shared buffer eviction  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Aug 27, 2014 at 8:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Aug 26, 2014 at 10:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Today, while working on updating the patch to improve locking
> > I found that as now we are going to have a new process, we need
> > a separate latch in StrategyControl to wakeup that process.
> > Another point is I think it will be better to protect
> > StrategyControl->completePasses with victimbuf_lck rather than
> > freelist_lck, as when we are going to update it we will already be
> > holding the victimbuf_lck and it doesn't make much sense to release
> > the victimbuf_lck and reacquire freelist_lck to update it.
>
> Sounds reasonable.  I think the key thing at this point is to get a
> new version of the patch with the background reclaim running in a
> different process than the background writer.  I don't see much point
> in fine-tuning the locking regimen until that's done.


I have updated the patch to address the feedback.  Main changes are:

1. For populating the freelist, use a separate process (bgreclaimer)
instead of doing it in bgwriter.
2. Autotune the low and high threshold values for buffers
in the freelist, using the formula you suggested upthread.
3. Clean up the locking regimen as discussed upthread (completely
eliminated the BufFreelist lock).
4. Improved comments and general code cleanup.
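
To make items 1 and 2 concrete, here is a minimal single-threaded sketch of a
reclaimer that refills a freelist between a low and a high water mark using a
clock sweep.  It is not the patch's code: every name with a sim_/SIM_ prefix
is hypothetical, the thresholds are fixed constants chosen for illustration
(the patch autotunes them with a formula not reproduced here), and all
locking and latch handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes for illustration only; the patch autotunes its thresholds. */
#define SIM_NBUFFERS        32
#define SIM_LOW_WATER_MARK   4
#define SIM_HIGH_WATER_MARK  8

typedef struct SimStrategy
{
    int  usage_count[SIM_NBUFFERS]; /* clock-sweep usage counters */
    bool is_free[SIM_NBUFFERS];     /* true while a buffer is on the freelist */
    int  num_free;                  /* current freelist length */
    int  next_victim;               /* clock hand */
    int  complete_passes;           /* full trips of the clock hand */
} SimStrategy;

/* Start with an empty freelist and a repeating 0,1,2 usage-count pattern. */
static void
sim_init(SimStrategy *s)
{
    for (int i = 0; i < SIM_NBUFFERS; i++)
    {
        s->usage_count[i] = i % 3;
        s->is_free[i] = false;
    }
    s->num_free = 0;
    s->next_victim = 0;
    s->complete_passes = 0;
}

/* The reclaimer should be woken once the freelist drops below the low mark. */
static bool
sim_reclaimer_needed(const SimStrategy *s)
{
    return s->num_free < SIM_LOW_WATER_MARK;
}

/*
 * One reclaimer cycle: advance the clock hand, decrementing usage counts,
 * and move zero-usage buffers to the freelist until the high mark is
 * reached.  Returns the number of buffers moved.
 */
static int
sim_reclaim_cycle(SimStrategy *s)
{
    int moved = 0;

    assert(SIM_HIGH_WATER_MARK <= SIM_NBUFFERS);
    while (s->num_free < SIM_HIGH_WATER_MARK)
    {
        int buf = s->next_victim;

        s->next_victim = (s->next_victim + 1) % SIM_NBUFFERS;
        if (s->next_victim == 0)
            s->complete_passes++;   /* hand wrapped around */

        if (s->is_free[buf])
            continue;               /* already on the freelist */
        if (s->usage_count[buf] > 0)
        {
            s->usage_count[buf]--;  /* give the buffer another chance */
            continue;
        }
        s->is_free[buf] = true;
        s->num_free++;
        moved++;
    }
    return moved;
}

/* A backend takes the first free buffer; returns its index, or -1 if none. */
static int
sim_get_free_buffer(SimStrategy *s)
{
    for (int i = 0; i < SIM_NBUFFERS; i++)
    {
        if (s->is_free[i])
        {
            s->is_free[i] = false;
            s->usage_count[i] = 1;
            s->num_free--;
            return i;
        }
    }
    return -1;
}
```

A caller would run sim_init() once, let backends call sim_get_free_buffer(),
and invoke sim_reclaim_cycle() whenever sim_reclaimer_needed() returns true.
In the real patch that wakeup would go through the latch discussed upthread.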

I have not yet added the statistic (buffers_backend_clocksweep),
because that needs one more variable in the BufferStrategyControl
structure, to which this patch already adds a few variables.
I think it is important to have such a stat available via
pg_stat_bgwriter, but I am not sure it is worth making the structure
a bit bulkier.

Another minor point concerns the changes in lwlock.h, which carries
this comment:

 * if you remove a lock, consider leaving a gap in the numbering
 * sequence for the benefit of DTrace and other external debugging
 * scripts.

As I have removed the BufFreelist lock, I have adjusted the numbering
in lwlock.h as well.  The message above the lock definitions suggests
leaving a gap when a lock is removed; however, I was not sure whether
this case (removing the first element) can affect anything,
so for now I have adjusted the numbering.
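
The two numbering choices can be sketched as follows.  This is an
illustration only, not the contents of lwlock.h: the SketchLock type, the
sketch_locks array, and the GAP_/RENUM_ macro names are hypothetical
stand-ins for locks addressed by their index in an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock living at a fixed array index. */
typedef struct SketchLock
{
    int state;
} SketchLock;

#define SKETCH_NUM_LOCKS 4
static SketchLock sketch_locks[SKETCH_NUM_LOCKS];

/*
 * Option A: leave a gap.  Slot 0, which belonged to the removed lock,
 * stays unused, so every remaining lock keeps its old number and
 * external scripts that key on lock numbers keep working.
 */
#define GAP_ShmemIndexLock   (&sketch_locks[1])
#define GAP_OidGenLock       (&sketch_locks[2])

/*
 * Option B: renumber.  No slot is wasted, but every lock after the
 * removed one shifts down by one.
 */
#define RENUM_ShmemIndexLock (&sketch_locks[0])
#define RENUM_OidGenLock     (&sketch_locks[1])

/* The number an external tool would observe for a given lock. */
static ptrdiff_t
sketch_lock_index(const SketchLock *lock)
{
    ptrdiff_t idx = lock - sketch_locks;

    assert(idx >= 0 && idx < SKETCH_NUM_LOCKS);
    return idx;
}
```

Because the removed lock is the first element, renumbering shifts the number
of every remaining lock, which is the case the comment in lwlock.h warns about.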

I have yet to collect data under varying loads; however, I have
collected performance data for 8GB shared buffers, which shows
reasonably good performance and scalability.

I think the main part left for this patch is more data for various loads,
which I will share in the next few days.  However, I think the patch is
ready for the next round of review, so I will mark it as Needs Review.

Performance Data:
-------------------------------

Configuration and Db Details

IBM POWER-7, 16 cores, 64 hardware threads
RAM = 64GB
Database Locale = C
checkpoint_segments = 256
checkpoint_timeout = 15min
shared_buffers = 8GB
scale factor = 3000
Client Count = number of concurrent sessions and threads (e.g. -c 8 -j 8)
Duration of each individual run = 5mins

All the data is in tps and taken using pgbench read-only load


Client Count/Patch_ver       8      16      32      64     128
HEAD                     58614  107370  140717  104357   65010
Patch                    60849  118701  165631  209226  213029

Note -
a. The numbers differ slightly from those reported previously, as
earlier I was using debug builds of the binaries to take data, and
it seems some kind of trace was enabled on the machine.
However, the improvement in performance and scalability is almost
the same as before.
b. The above data is the median of 3 runs; for detailed data, refer to
the attached document (perf_read_scalability_data_v5.ods)
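
For anyone who wants to redo the arithmetic, the sketch below shows how a
median of three runs and the Patch/HEAD ratio can be computed.  The tps
arrays are copied from the table above; the function names are mine and the
per-run values are not included, since they are only in the attached
document.

```c
#include <assert.h>
#include <stddef.h>

/* Median of three values, as used to summarise three pgbench runs. */
static double
median3(double a, double b, double c)
{
    if ((a <= b && b <= c) || (c <= b && b <= a))
        return b;
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return a;
    return c;
}

/* Median tps per client count, copied from the table in this mail. */
static const int    client_counts[] = {8, 16, 32, 64, 128};
static const double head_tps[]      = {58614, 107370, 140717, 104357, 65010};
static const double patch_tps[]     = {60849, 118701, 165631, 209226, 213029};

#define NUM_POINTS (sizeof(client_counts) / sizeof(client_counts[0]))

/* Client count for the i-th column of the table. */
static int
clients_at(size_t i)
{
    assert(i < NUM_POINTS);
    return client_counts[i];
}

/* Ratio of patched tps to HEAD tps for the i-th column. */
static double
patch_speedup(size_t i)
{
    assert(i < NUM_POINTS);
    assert(head_tps[i] > 0);
    return patch_tps[i] / head_tps[i];
}
```

With the table's numbers, the ratio is about 1.04 at 8 clients, about 2.0 at
64 clients, and about 3.28 at 128 clients, where HEAD throughput falls off
and the patched build keeps scaling.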

CPU Usage
------------------
I have observed that CPU usage for the new process (bgreclaimer) is
between 5% and 9%.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com