Re: Separate BLCKSZ for data and logging - Mailing list pgsql-hackers

From Qingqing Zhou
Subject Re: Separate BLCKSZ for data and logging
Msg-id dvd4ss$82m$1@news.hub.org
In response to Separate BLCKSZ for data and logging  (Mark Wong <markw@osdl.org>)
List pgsql-hackers
"Simon Riggs" <simon@2ndquadrant.com> wrote
>
> I think Tom's right... the OS blocksize is smaller than BLCKSZ, so
> reducing the size might help with a very high transaction load when
> commits are required very frequently. At checkpoint it sounds like we
> might benefit from a large WAL blocksize because of all the additional
> blocks written, but we often write more than one block at a time anyway,
> and that still translates to multiple OS blocks whichever way you cut
> it, so I'm not convinced yet.
>

From what I have observed in other database systems, they really do something
like this. The disk write sequence looks something like:
   512   512   2048   4096   32768   512   ...

That is, the size of each xlog write is always aligned to the disk sector size
(as required by O_DIRECT), and each write tries to push out as much as possible
(within an upper bound, 32768 I guess). As I understand it, this change would
not be much trouble; a local change in XLogWrite() may be enough.
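
To make the sizing rule concrete, here is a rough sketch of how such a write size could be computed. This is not code from XLogWrite() or from any other database system; the names SECTOR_SIZE, MAX_WRITE, align_up() and next_write_size() are made up for illustration, and the 512 and 32768 values are simply the numbers guessed at above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define SECTOR_SIZE 512u    /* disk sector size that O_DIRECT writes must align to */
#define MAX_WRITE   32768u  /* guessed upper bound for a single write */

/* Round nbytes up to the next multiple of sector. */
static size_t
align_up(size_t nbytes, size_t sector)
{
    assert(sector > 0);
    return ((nbytes + sector - 1) / sector) * sector;
}

/*
 * Given the number of pending log bytes, return the size of the next
 * write: sector-aligned, and never larger than MAX_WRITE.
 */
static size_t
next_write_size(size_t pending)
{
    size_t n = align_up(pending, SECTOR_SIZE);

    return (n > MAX_WRITE) ? MAX_WRITE : n;
}
```

With these assumptions, a small commit record of a few hundred bytes goes out as one 512-byte write, while a large backlog is capped at 32768 bytes per write and the rest is left for the following writes.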

Regards,
Qingqing
