Re: Scaling shared buffer eviction - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Scaling shared buffer eviction
Msg-id CAA4eK1LEEmZvqMJLgmc72sVpTt293mVYFkT2toThvc-QjBL=kg@mail.gmail.com
In response to Re: Scaling shared buffer eviction  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Jun 9, 2014 at 9:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Jun 8, 2014 at 7:21 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> > Backend processes related to user connections still
> > performed about 30% of the writes, and this work shows promise
> > toward bringing that down, which would be great; but please don't
> > eliminate the ability to prevent write stalls in the process.
>
>
> I am planning to take some more performance data, part of which will
> be a write load as well, but I am not sure whether that can show the
> need you mention.

After collecting performance data for a write load (tpc-b) with the
patch, I found a regression.  Investigating it, I found that with the
patch the bgwriter had started flushing buffers that were still needed
by backends.  The reason was that *nextVictimBuffer* was not being
updated properly while the bgwriter runs its clock-sweep-style logic
(decrementing usage counts when the number of buffers on the freelist
falls below the low threshold).  In HEAD, I noticed that at default
settings the bgwriter was not flushing any buffers at all, which is at
least better than what my patch was doing (flushing buffers required by
backends).

So I tried to fix the issue by updating *nextVictimBuffer* in the new
bgwriter logic, and the results are positive.
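To make the idea concrete, here is a minimal single-threaded sketch of a clock-sweep refill that advances the hand on every step.  It is a toy model, not the patch: ToyBuffer, ToyPool, TOY_MAX_USAGE, toy_refill_freelist and the low/high marks are all hypothetical names chosen for illustration, and locking, pins and dirty-buffer handling are left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structures. */
#define TOY_MAX_USAGE 5     /* assumed cap on usage_count */

typedef struct {
    int usage_count;        /* 0 means the buffer is a candidate for reuse */
    int on_freelist;        /* 1 once the buffer has been moved to the freelist */
} ToyBuffer;

typedef struct {
    ToyBuffer *buffers;
    int nbuffers;
    int next_victim;        /* models nextVictimBuffer, the clock hand */
    int num_free;           /* number of buffers currently on the freelist */
} ToyPool;

/*
 * If the freelist has fallen below low_mark, sweep the pool until it holds
 * high_mark buffers.  The hand is advanced on every step, so the next caller
 * (bgwriter or backend) continues where this sweep stopped.
 * Returns the number of buffers moved to the freelist.
 */
int toy_refill_freelist(ToyPool *pool, int low_mark, int high_mark)
{
    int moved = 0;
    int steps = 0;
    int max_steps;

    assert(pool->nbuffers > 0);
    if (pool->num_free >= low_mark)
        return 0;           /* enough free buffers, nothing to do */

    /* Bound the sweep so it terminates even if high_mark is unreachable. */
    max_steps = pool->nbuffers * (TOY_MAX_USAGE + 1);

    while (pool->num_free < high_mark && steps < max_steps) {
        ToyBuffer *buf = &pool->buffers[pool->next_victim];

        /* Advance the shared hand before looking at the buffer. */
        pool->next_victim = (pool->next_victim + 1) % pool->nbuffers;
        steps++;

        if (buf->on_freelist)
            continue;       /* already free */
        if (buf->usage_count > 0) {
            buf->usage_count--;     /* recently used: give it another pass */
            continue;
        }
        buf->on_freelist = 1;       /* unused since the last pass: free it */
        pool->num_free++;
        moved++;
    }
    return moved;
}
```

In this model, buffers that were used recently survive one more pass of the hand, and because the hand position is written back, a later sweep does not revisit the same buffers first.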

sbe - scalable buffer eviction

Select-only data

Client count/TPS        64       128
Un-patched           45232     17310
sbe_v3              111468    114521
sbe_v4              153137    160752

TPC-B

Client count/TPS        64       128
Un-patched             825       784
sbe_v4                 814       845


For the select-only data, I am quite confident that introducing
nextVictimBuffer increments in the bgwriter helps; it scales much better
with that change.  For TPC-B, however, the numbers fluctuate, so I am
not sure the problem has been eliminated.  The main difference is that
in HEAD the bgwriter never increments nextVictimBuffer while syncing
buffers; it just notes the current value before it starts and then
proceeds sequentially.
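For contrast, the HEAD behaviour described above can be sketched roughly as follows.  Again this is a toy model under stated assumptions: SyncBuf and toy_sync_ahead are hypothetical names, the "write" is reduced to clearing a dirty flag, and the real bgwriter's pacing logic is omitted.

```c
#include <assert.h>

/* Hypothetical, simplified buffer; not PostgreSQL's real structure. */
typedef struct {
    int usage_count;    /* 0 means the buffer is likely to be reused soon */
    int dirty;          /* 1 if the buffer would need a write before reuse */
} SyncBuf;

/*
 * Snapshot the clock hand once, then walk forward from that position,
 * "writing" (here: just marking clean) dirty buffers with usage_count 0.
 * The hand is taken through a const pointer to make explicit that the
 * writer only reads it and never advances it.
 * Returns the number of buffers cleaned.
 */
int toy_sync_ahead(SyncBuf *buffers, int nbuffers,
                   const int *next_victim, int max_scan)
{
    int pos;
    int scanned;
    int cleaned = 0;

    assert(nbuffers > 0);
    pos = *next_victim;         /* note the current setting before starting */

    for (scanned = 0; scanned < max_scan && scanned < nbuffers; scanned++) {
        SyncBuf *buf = &buffers[pos];

        if (buf->usage_count == 0 && buf->dirty) {
            buf->dirty = 0;     /* stands in for flushing the buffer */
            cleaned++;
        }
        pos = (pos + 1) % nbuffers;     /* local cursor only */
    }
    return cleaned;
}
```

The point of the sketch is only that the writer uses a private cursor: usage counts and the shared hand are left untouched, so it cannot take a buffer away from a backend by itself.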

I think it would be good to have a new process for moving buffers to
the freelist, for the following reasons:

a. While moving buffers to the freelist, it should not block on
intervening write activity.
b. The writer should not increment nextVictimBuffer; it should keep
its current logic.

One significant change in this version of the patch is the use of a
separate spinlock to protect nextVictimBuffer, rather than
BufFreelistLock.
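As a rough illustration of that locking split (not the patch's code), the hand can be guarded by its own small lock so that advancing it never waits on freelist manipulation.  The sketch below uses a C11 atomic_flag as a stand-in spinlock; ToyStrategy, victim_lock, toy_strategy_init and toy_advance_victim are hypothetical names, and PostgreSQL's own spinlock primitives are deliberately not used here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical strategy state; not PostgreSQL's real control structure. */
typedef struct {
    atomic_flag victim_lock;    /* protects next_victim only */
    int next_victim;            /* models nextVictimBuffer */
    int nbuffers;
} ToyStrategy;

void toy_strategy_init(ToyStrategy *s, int nbuffers)
{
    assert(nbuffers > 0);
    atomic_flag_clear(&s->victim_lock);
    s->next_victim = 0;
    s->nbuffers = nbuffers;
}

/*
 * Return the current victim index and advance the hand, holding only the
 * dedicated lock.  The critical section is a few instructions long, which
 * is the case a spinlock is suited to.
 */
int toy_advance_victim(ToyStrategy *s)
{
    int victim;

    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&s->victim_lock,
                                             memory_order_acquire))
        ;

    victim = s->next_victim;
    s->next_victim = (victim + 1) % s->nbuffers;

    atomic_flag_clear_explicit(&s->victim_lock, memory_order_release);
    return victim;
}
```

A caller would examine the returned buffer after releasing the lock, so contention on the hand stays independent of whatever lock guards the freelist.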

Suggestions?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
