Re: [WIP PATCH] for Performance Improvement in Buffer Management - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [WIP PATCH] for Performance Improvement in Buffer Management
Date
Msg-id 6C0B27F7206C9E4CA54AE035729E9C3828530C8F@szxeml509-mbs
In response to Re: [WIP PATCH] for Performance Improvement in Buffer Management  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: [WIP PATCH] for Performance Improvement in Buffer Management
List pgsql-hackers
On Tuesday, September 04, 2012 6:55 PM Amit Kapila wrote:
On Tuesday, September 04, 2012 12:42 AM Jeff Janes wrote:
On Mon, Sep 3, 2012 at 7:15 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>> This patch is based on the below TODO item:
>>>
>>> Consider adding buffers the background writer finds reusable to the
>>> free list
>>>
>>> I have tried implementing it and taken the readings for SELECT when
>>> all the data is in either OS buffers or shared buffers.
>>>
>>> The patch has a simple implementation of "bgwriter or checkpoint
>>> process moving the unused buffers (unpinned, with zero usage_count)
>>> into the freelist".

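For reference, the core of what the patch does is roughly the following (a simplified sketch against the 9.2 bufmgr internals, assuming bufmgr.c context; not the actual patch text):

    /*
     * Sketch: during the bgwriter scan, hand clean, unpinned,
     * zero-usage buffers back to the freelist via InvalidateBuffer(),
     * which drops the buffer's hash table entry and then puts the
     * buffer on the freelist.  (The BM_DIRTY check is an assumption
     * here, not quoted from the patch.)
     */
    int     i;

    for (i = 0; i < NBuffers; i++)
    {
        volatile BufferDesc *buf = &BufferDescriptors[i];

        LockBufHdr(buf);
        if (buf->refcount == 0 && buf->usage_count == 0 &&
            !(buf->flags & BM_DIRTY))
        {
            /* InvalidateBuffer expects the header spinlock to be held
             * at entry and releases it itself */
            InvalidateBuffer(buf);
        }
        else
            UnlockBufHdr(buf);
    }
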
>> I don't think InvalidateBuffer can be safely used in this way.  It
>> says "We assume that no other backend could possibly be interested
>> in using the page", which is not true here.

> As I understand from the above, the attached patch has the following
> problem: in InvalidateBuffer(), after UnlockBufHdr() and before the
> partition lock is acquired, some backend can start using that buffer
> and increase its usage count to 1, yet InvalidateBuffer() will still
> remove the buffer from the hash table and put it on the freelist.
> I have modified the code to address this by rechecking refcount and
> usage_count after acquiring the partition lock and LockBufHdr, and
> only then moving the buffer to the freelist, which is similar to what
> InvalidateBuffer itself does.
> In the final code we can optimize by passing an extra parameter to
> InvalidateBuffer.

> Please let me know whether I have understood you correctly, or whether
> you meant something else by the above comment.
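
To be concrete, the recheck now looks roughly like this (a minimal sketch modelled on 9.2's InvalidateBuffer(); the function name and details are illustrative, not the patch text, and it assumes bufmgr.c context):

    /*
     * Move "buf" to the freelist only if nobody started using it
     * between the bgwriter's first look and our taking of the locks.
     */
    static void
    InvalidateBufferIfUnused(volatile BufferDesc *buf)
    {
        BufferTag   oldTag;
        uint32      oldHash;
        LWLockId    oldPartitionLock;
        BufFlags    oldFlags;

        /* Snapshot the tag so we can locate the mapping partition */
        LockBufHdr(buf);
        oldTag = buf->tag;
        UnlockBufHdr(buf);

        oldHash = BufTableHashCode(&oldTag);
        oldPartitionLock = BufMappingPartitionLock(oldHash);

        LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
        LockBufHdr(buf);

        /* Recheck under both locks: tag unchanged, unpinned, unused */
        if (BUFFERTAGS_EQUAL(buf->tag, oldTag) &&
            buf->refcount == 0 && buf->usage_count == 0 &&
            !(buf->flags & BM_DIRTY))
        {
            oldFlags = buf->flags;
            CLEAR_BUFFERTAG(buf->tag);
            buf->flags = 0;
            buf->usage_count = 0;
            UnlockBufHdr(buf);

            if (oldFlags & BM_TAG_VALID)
                BufTableDelete(&oldTag, oldHash);
            LWLockRelease(oldPartitionLock);

            StrategyFreeBuffer(buf);
        }
        else
        {
            UnlockBufHdr(buf);
            LWLockRelease(oldPartitionLock);
        }
    }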

The results for the updated code are attached to this mail.
The scenario is the same as in the original mail:
    1. Load all tables and indexes into OS buffers (using pg_prewarm with the 'read' operation).
    2. Load shared buffers with the "pgbench_accounts" table and "pgbench_accounts_pkey" index pages (using pg_prewarm with the 'buffers' operation).
    3. Run pgbench with SELECT-only transactions for 20 minutes.

Platform details:
    Operating System: Suse-Linux 10.2 x86_64
    Hardware : 4 core (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
    RAM : 24GB

Server Configuration:
    shared_buffers = 5GB     (1/4 of RAM)
    Total data size = 16GB
Pgbench configuration:
    transaction type: SELECT only
    scaling factor: 1200
    query mode: simple
    number of clients: <varying from 8 to 64>
    number of threads: <varying from 8 to 64>
    duration: 1200 s

I shall take further readings for the following configurations and post them:

1. The intention of this configuration is that, with the defined test case, there will be some cases where I/O can happen, so I want to check its impact.

Shared_buffers - 7GB
number of clients: <varying from 8 to 64>
number of threads: <varying from 8 to 64>
transaction type: SELECT only


2. The intention of this configuration is that, with the defined test case, the memory kept for shared buffers is less than recommended, so I want to check its impact.

Shared_buffers - 2GB
number of clients: <varying from 8 to 64>
number of threads: <varying from 8 to 64>
transaction type: SELECT only


3. The intention of this configuration is that, with the defined test case, it will exercise a mix of DML operations, where there will be I/O due to those operations, so I want to check its impact.

Shared_buffers - 5GB
number of clients: <varying from 8 to 64>
number of threads: <varying from 8 to 64>
transaction type: tpc_b

> One problem I can see with the proposed change is that in some cases
> the usage count of a buffer just allocated from the freelist will get
> decremented immediately, because that buffer can also be the clock
> sweep's nextVictimBuffer, as illustrated below.
> However, there can be a solution to this problem.
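
To illustrate the decrement (a trimmed paraphrase of the clock-sweep loop in 9.2's StrategyGetBuffer(), omitting the trycounter logic; not the patch text):

    for (;;)
    {
        buf = &BufferDescriptors[StrategyControl->nextVictimBuffer];
        if (++StrategyControl->nextVictimBuffer >= NBuffers)
            StrategyControl->nextVictimBuffer = 0;

        LockBufHdr(buf);
        if (buf->refcount == 0)
        {
            if (buf->usage_count > 0)
            {
                /*
                 * The decrement in question: a buffer just handed out
                 * from the freelist (usage_count set to 1 when it was
                 * re-tagged in BufferAlloc) can be the very next buffer
                 * the sweep lands on, losing its usage count at once.
                 */
                buf->usage_count--;
            }
            else
                return buf;     /* victim found; the real code returns
                                 * with the header spinlock still held */
        }
        UnlockBufHdr(buf);
    }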


With Regards,
Amit Kapila.
