Re: Sorted writes in checkpoint - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Sorted writes in checkpoint
Msg-id 467140FE.1080404@enterprisedb.com
In response to Sorted writes in checkpoint  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers
ITAGAKI Takahiro wrote:
> Greg Smith <gsmith@gregsmith.com> wrote:
>> On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
>>> If the kernel can treat sequential writes better than random writes, is 
>>> it worth sorting dirty buffers in block order per file at the start of 
>>> checkpoints?
> 
> I wrote and tested the attached sorted-writes patch based on Heikki's
> ldc-justwrites-1.patch. There was an obvious performance win on OLTP workloads.
> 
>   tests                    | pgbench | DBT-2 response time (avg/90%/max)
> ---------------------------+---------+-----------------------------------
>  LDC only                  | 181 tps | 1.12 / 4.38 / 12.13 s
>  + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
>  + Sorted writes           | 224 tps | 0.36 / 0.80 /  8.11 s
> 
> (*) Don't write buffers that were dirtied after starting the checkpoint.
> 
> machine : 2GB-ram, SCSI*4 RAID-5
> pgbench : -s400 -t40000 -c10  (about 5GB of database)
> DBT-2   : 60WH (about 6GB of database)

Wow, I didn't expect that much gain from the sorted writes. How was LDC 
configured?

>> 3) The OS disk elevator should be dealing with this issue, particularly 
>> because it may really know the actual disk ordering.

Yeah, but we don't give the OS that much chance to coalesce writes when 
we spread them out.
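
To make the idea concrete, here is a minimal sketch of what sorting by file and block buys us. This is not the attached patch; DirtyBufTag, sort_dirty_buffers and count_write_runs are hypothetical names invented for illustration, and file_id is just a stand-in for whatever identifies the relation file. It only shows that once the dirty list is ordered per file by block number, adjacent blocks are issued back to back, so the OS sees fewer, longer sequential runs even when the writes are spread out over time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag for one dirty buffer (not the real buffer header). */
typedef struct DirtyBufTag
{
    uint32_t file_id;   /* stand-in for a relation file identifier */
    uint32_t block_no;  /* block number within that file */
    int      buf_id;    /* index of the buffer in an assumed buffer pool */
} DirtyBufTag;

/* Order by file first, then by block number within the file. */
static int
dirty_buf_cmp(const void *a, const void *b)
{
    const DirtyBufTag *x = a;
    const DirtyBufTag *y = b;

    if (x->file_id != y->file_id)
        return (x->file_id < y->file_id) ? -1 : 1;
    if (x->block_no != y->block_no)
        return (x->block_no < y->block_no) ? -1 : 1;
    return 0;
}

/* Sort the list of buffers to be written at checkpoint, in place. */
void
sort_dirty_buffers(DirtyBufTag *tags, size_t n)
{
    assert(n == 0 || tags != NULL);
    if (n > 1)
        qsort(tags, n, sizeof(DirtyBufTag), dirty_buf_cmp);
}

/*
 * Count runs of consecutive blocks in the same file, in write order.
 * Fewer runs means more writes that land next to the previous one.
 */
size_t
count_write_runs(const DirtyBufTag *tags, size_t n)
{
    size_t runs = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i == 0 ||
            tags[i].file_id != tags[i - 1].file_id ||
            tags[i].block_no != tags[i - 1].block_no + 1)
            runs++;
    }
    return runs;
}
```

Calling count_write_runs before and after sort_dirty_buffers on the same array gives a rough idea of how much more sequential the write stream becomes; it says nothing about the physical layout on disk, which only the OS and the controller know.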

>> Here's the subtle thing:  by writing in the same order the LRU scan occurs 
>> in, you are writing dirty buffers in the optimal fashion to eliminate 
>> client backend writes during BufferAlloc.  This makes the checkpoint a 
>> really effective LRU clearing mechanism.  Writing in block order will 
>> change that.
> 
> The issue will probably go away after we have LDC, because it writes LRU
> buffers during checkpoints.

I think so too.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com