Thread: WIP: splitting BLCKSZ
I proposed to explore splitting BLCKSZ into separate values for logging and data to see if there might be anything to gain:

http://archives.postgresql.org/pgsql-hackers/2006-03/msg00745.php

My first pass was to do more or less a search and replace (attached) and I am already running into trouble with a 'make check' (below). I'm guessing that when initdb is run, I'm not properly saving the values that I've defined for DATA_BLCKSZ and possibly LOG_BLCKSZ. So I'm hoping someone could give me a pointer, and I thought it might be a good idea to send something out.

Thanks,
Mark

-----
Running in noclean mode. Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "markw".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /home/markw/shell/src/pgsql/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers/max_fsm_pages ... 3000/150000
creating configuration files ... ok
creating template1 database in /home/markw/shell/src/pgsql/src/test/regress/./tmp_check/data/base/1 ... PANIC: database files are incompatible with server
DETAIL: The database cluster was initialized with DATA_BLCKSZ 0, but the server was compiled with DATA_BLCKSZ 8192.
HINT: It looks like you need to recompile or initdb.
child process was terminated by signal 6
Attachment
Mark Wong <markw@osdl.org> writes: > I proposed to explore splitting BLCKSZ into separate values for logging > and data to see if there might be anything to gain: > http://archives.postgresql.org/pgsql-hackers/2006-03/msg00745.php > My first pass was to do more or less a search and replace (attached) and > I am already running into trouble with a 'make check' (below). I'm > guessing that when initdb is run, I'm not properly saving the values > that I've defined for DATA_BLCKSZ and possibly LOG_BLCKSZ. I'd suggest leaving BLCKSZ as-is and inventing XLOG_BLCKSZ to be used only within the WAL code; should make for a *far* smaller patch. Offhand I don't think that anything except xlog.c knows the WAL block size --- it should be fairly closely associated with dependencies on XLOG_SEG_SIZE, if you are looking for something to grep for. regards, tom lane
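[Editor's sketch] The suggestion above amounts to leaving BLCKSZ alone and giving the WAL code its own page-size symbol that only has to agree with XLOG_SEG_SIZE. A minimal illustration in C follows. The value 4096 for XLOG_BLCKSZ, the 16 MB segment size, and both helper functions are assumptions made for the example, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Data page size: left untouched, as suggested in the message above. */
#define BLCKSZ        8192
/* WAL page size: a separate symbol. 4096 is only an illustrative value. */
#define XLOG_BLCKSZ   4096
/* WAL segment size: 16 MB is assumed for this sketch. */
#define XLOG_SEG_SIZE (16 * 1024 * 1024)

/* A segment must hold a whole number of WAL pages. */
#if (XLOG_SEG_SIZE % XLOG_BLCKSZ) != 0
#error "XLOG_SEG_SIZE must be a multiple of XLOG_BLCKSZ"
#endif

/* Hypothetical helper: number of WAL pages in one segment. */
static size_t
xlog_pages_per_segment(void)
{
    return XLOG_SEG_SIZE / XLOG_BLCKSZ;
}

/* Hypothetical helper: offset of the start of the WAL page that
 * contains the given byte offset within a segment. */
static size_t
xlog_page_start(size_t seg_offset)
{
    assert(seg_offset < XLOG_SEG_SIZE);
    return seg_offset - (seg_offset % XLOG_BLCKSZ);
}
```

Nothing in this sketch refers to BLCKSZ after its definition, which is the point: code that only does WAL page arithmetic should depend on XLOG_BLCKSZ and XLOG_SEG_SIZE alone.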
On Wed, 22 Mar 2006 14:19:48 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > Mark Wong <markw@osdl.org> writes: > > I proposed to explore splitting BLCKSZ into separate values for logging > > and data to see if there might be anything to gain: > > http://archives.postgresql.org/pgsql-hackers/2006-03/msg00745.php > > My first pass was to do more or less a search and replace (attached) and > > I am already running into trouble with a 'make check' (below). I'm > > guessing that when initdb is run, I'm not properly saving the values > > that I've defined for DATA_BLCKSZ and possibly LOG_BLCKSZ. > > I'd suggest leaving BLCKSZ as-is and inventing XLOG_BLCKSZ to be used > only within the WAL code; should make for a *far* smaller patch. > Offhand I don't think that anything except xlog.c knows the WAL block > size --- it should be fairly closely associated with dependencies on > XLOG_SEG_SIZE, if you are looking for something to grep for. Ok, I have attached something much smaller. Appears to pass a 'make check' but I'll keep going to make sure it's really correct and works. Thanks, Mark
Attachment
On Thu, 2006-03-23 at 11:27 -0800, Mark Wong wrote:
> On Wed, 22 Mar 2006 14:19:48 -0500
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Mark Wong <markw@osdl.org> writes:
> > > I proposed to explore splitting BLCKSZ into separate values for logging
> > > and data to see if there might be anything to gain:
> > > http://archives.postgresql.org/pgsql-hackers/2006-03/msg00745.php
> > > My first pass was to do more or less a search and replace (attached) and
> > > I am already running into trouble with a 'make check' (below). I'm
> > > guessing that when initdb is run, I'm not properly saving the values
> > > that I've defined for DATA_BLCKSZ and possibly LOG_BLCKSZ.
> >
> > I'd suggest leaving BLCKSZ as-is and inventing XLOG_BLCKSZ to be used
> > only within the WAL code; should make for a *far* smaller patch.
> > Offhand I don't think that anything except xlog.c knows the WAL block
> > size --- it should be fairly closely associated with dependencies on
> > XLOG_SEG_SIZE, if you are looking for something to grep for.
>
> Ok, I have attached something much smaller. Appears to pass a 'make
> check' but I'll keep going to make sure it's really correct and works.

AFAICS this patch won't work properly yet, even though the idea is cool.

Backup data blocks have the "hole" removed from them in xlog.c. This is definitely BLCKSZ, not XLOG_BLCKSZ, at lines 675, 798, 799 and 813, maybe others. But then you need to put them into a block of size XLOG_BLCKSZ, e.g. line 708.

It might be worth looking at xlog.c for 8.0 to see which places used BLCKSZ before hole-removal was introduced in 8.1.

I think we should set the default wal_buffers setting to 16 at least now, if we are halving the default size of the blocks. 32 might be more realistic for 8.2.

Best Regards,
Simon Riggs
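[Editor's sketch] The distinction drawn above is that the source of a backup block is a data page (BLCKSZ bytes, minus its hole), while the destination is a sequence of WAL pages (XLOG_BLCKSZ bytes each). A rough illustration in C follows. The function names, the parameters, and the value 4096 for XLOG_BLCKSZ are hypothetical, and this is not the code in xlog.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLCKSZ      8192   /* data page size */
#define XLOG_BLCKSZ 4096   /* WAL page size; illustrative value only */

/* Copy a data page into 'out' with the unused "hole"
 * [hole_offset, hole_offset + hole_length) removed.
 * Sizes here are in terms of BLCKSZ, because the source is a data page.
 * Returns the number of bytes written, i.e. BLCKSZ - hole_length. */
static size_t
copy_without_hole(const char *page, size_t hole_offset,
                  size_t hole_length, char *out)
{
    size_t tail_start = hole_offset + hole_length;

    assert(tail_start <= BLCKSZ);
    memcpy(out, page, hole_offset);
    memcpy(out + hole_offset, page + tail_start, BLCKSZ - tail_start);
    return BLCKSZ - hole_length;
}

/* Count how many WAL pages 'len' bytes touch, given the free space left
 * in the current WAL page and the usable space in each following page.
 * This side of the arithmetic depends on XLOG_BLCKSZ, not BLCKSZ. */
static size_t
xlog_pages_spanned(size_t len, size_t free_in_current,
                   size_t usable_per_page)
{
    size_t rest;

    if (len <= free_in_current)
        return 1;
    rest = len - free_in_current;
    return 1 + (rest + usable_per_page - 1) / usable_per_page;
}
```

With these assumed sizes, a full 8192-byte page with no hole written into a WAL page that has 1000 bytes free would touch three WAL pages, which is why mixing up the two constants at the boundary matters.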
Here's an updated patch with help from Simon. Once I get a test system going again in the lab I'll start posting some data. I'm planning a combination of block sizes (BLCKSZ and XLOG_BLCKSZ) and number of WAL buffers. Thanks, Mark
Attachment
On 4/3/06, Mark Wong <markw@osdl.org> wrote: > Once I get a test system going again in the lab I'll start > posting some data. I'm planning a combination of > block sizes (BLCKSZ and XLOG_BLCKSZ) and number > of WAL buffers. Cool. I'm looking forward to the results. -- Jonah H. Harris, Database Internals Architect EnterpriseDB Corporation 732.331.1324
Mark Wong <markw@osdl.org> writes: > Here's an updated patch with help from Simon. Once I get a test system > going again in the lab I'll start posting some data. I'm planning a > combination of block sizes (BLCKSZ and XLOG_BLCKSZ) and number of WAL > buffers. If there's no objection, I'll go ahead and apply the parts of this that create a separate XLOG_BLCKSZ symbol, but not (yet) the parts that actually change any parameter values. I can't see any very good reason why data block size and xlog block size were ever tied together, and I think it'll make the code read better if they're separated. regards, tom lane
Mark Wong <markw@osdl.org> writes: > Here's an updated patch with help from Simon. Once I get a test system > going again in the lab I'll start posting some data. I'm planning a > combination of block sizes (BLCKSZ and XLOG_BLCKSZ) and number of WAL > buffers. Applied with minor corrections (you missed pg_resetxlog, for one). Also, I did not apply the change in default wal_buffers setting, as that seems it should wait for evidence of what XLOG_BLCKSZ ought to be. regards, tom lane
On Mon, 2006-04-03 at 19:37 -0400, Tom Lane wrote:
> Mark Wong <markw@osdl.org> writes:
> > Here's an updated patch with help from Simon. Once I get a test system
> > going again in the lab I'll start posting some data. I'm planning a
> > combination of block sizes (BLCKSZ and XLOG_BLCKSZ) and number of WAL
> > buffers.
>
> Applied with minor corrections (you missed pg_resetxlog, for one).

Thanks. (That omission was mine, not Mark's.)

On Mon, 2006-04-03 at 18:08 -0400, Tom Lane wrote:
> I can't see any very good reason
> why data block size and xlog block size were ever tied together, and I
> think it'll make the code read better if they're separated.

I see you've changed the control file back from XLOG_BLCKSZ to BLCKSZ; I wasn't sure which one of those to choose. Perhaps that should be changed to PGCONTROL_BLCKSZ to differentiate it more clearly as well (but not put in pg_config_manual.h)?

Best Regards,
Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > I see you've changed the control file back from XLOG_BLCKSZ to BLCKSZ; I > wasn't sure which one of those to choose. Hm. The entire point of having a BLCKSZ-sized control file is to have it *not* change in size across format revisions (see the comments) ... which I suppose means that we really ought to have a hard-wired separate constant, rather than depending on something that someone might want to twiddle. regards, tom lane
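[Editor's sketch] The point made above is that the on-disk size of the control file should come from its own hard-wired constant, so that changing BLCKSZ or XLOG_BLCKSZ cannot change it. A minimal illustration in C follows. CONTROL_FILE_SIZE, the ControlData struct and both functions are hypothetical names for this example, and the real control structure has many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hard-wired on-disk size of the control file (hypothetical name).
 * Deliberately not defined in terms of BLCKSZ or XLOG_BLCKSZ. */
#define CONTROL_FILE_SIZE 8192

/* Greatly simplified stand-in for the real control data. */
typedef struct ControlData
{
    unsigned int version;
    unsigned int blcksz;       /* data page size the cluster was built with */
    unsigned int xlog_blcksz;  /* WAL page size the cluster was built with */
} ControlData;

/* Compile-time check that the struct fits in the fixed-size file. */
typedef char control_data_fits[(sizeof(ControlData) <= CONTROL_FILE_SIZE) ? 1 : -1];

/* Build the buffer that would be written to disk: struct first,
 * zero padding up to the fixed size. */
static void
fill_control_buffer(const ControlData *cd, char *buf)
{
    assert(cd != NULL && buf != NULL);
    memset(buf, 0, CONTROL_FILE_SIZE);
    memcpy(buf, cd, sizeof(ControlData));
}

/* Recover the struct from a buffer of CONTROL_FILE_SIZE bytes. */
static void
read_control_buffer(const char *buf, ControlData *cd)
{
    assert(cd != NULL && buf != NULL);
    memcpy(cd, buf, sizeof(ControlData));
}
```

Because the buffer length never depends on either block size, a build with a different BLCKSZ or XLOG_BLCKSZ can still read the file and report the mismatch instead of misreading it.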
On Tue, 2006-04-04 at 11:13 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I see you've changed the control file back from XLOG_BLCKSZ to BLCKSZ; I
> > wasn't sure which one of those to choose.
>
> Hm. The entire point of having a BLCKSZ-sized control file is to have
> it *not* change in size across format revisions (see the comments) ...
> which I suppose means that we really ought to have a hard-wired separate
> constant, rather than depending on something that someone might want to
> twiddle.

Patch enclosed.

It passes make check, and this sequence of actions also works fine:

initdb b512
pg_controldata b512
pg_resetxlog b512
pg_controldata b512

Best Regards,
Simon Riggs
Attachment
On Tue, 2006-04-04 at 17:33 +0100, Simon Riggs wrote: > On Tue, 2006-04-04 at 11:13 -0400, Tom Lane wrote: > > Simon Riggs <simon@2ndquadrant.com> writes: > > > I see you've changed the control file back from XLOG_BLCKSZ to BLCKSZ; I > > > wasn't sure which one of those to choose. > > > > Hm. The entire point of having a BLCKSZ-sized control file is to have > > it *not* change in size across format revisions (see the comments) ... > > which I suppose means that we really ought to have a hard-wired separate > > constant, rather than depending on something that someone might want to > > twiddle. An additional patch enclosed that adds xlog blcksz onto the xlog long header at the start of each xlog file, so we can cross-check between file and system, as we do with xlog seg size. (This is an *additional* patch, not a replacement one). Best Regards, Simon Riggs
Attachment
Simon Riggs <simon@2ndquadrant.com> writes: > An additional patch enclosed that adds xlog blcksz onto the xlog long > header at the start of each xlog file, so we can cross-check between > file and system, as we do with xlog seg size. That would require an xlog format change (XLOG_PAGE_MAGIC bump). Might be worth it anyway to avoid confusion in PITR log-shipping situations. I thought about it yesterday, and concluded it wasn't really worth the trouble, but am willing to reconsider. Note you forgot pg_resetxlog again .-) regards, tom lane
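[Editor's sketch] The cross-check discussed above compares values stored in the long header at the start of each WAL file with the values the server was compiled with, in the same way the segment size is already checked. A minimal illustration in C follows. The struct layout, field names, enum, function, and the chosen constant values are assumptions for the example, not the actual header definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XLOG_BLCKSZ   8192                 /* compiled-in WAL page size (illustrative) */
#define XLOG_SEG_SIZE (16 * 1024 * 1024)   /* compiled-in segment size (assumed) */

/* Simplified, hypothetical stand-in for the long header of a WAL file. */
typedef struct LongHeader
{
    uint32_t seg_size;     /* segment size recorded when the file was written */
    uint32_t xlog_blcksz;  /* WAL page size recorded when the file was written */
} LongHeader;

typedef enum HeaderCheck
{
    HEADER_OK,
    HEADER_BAD_SEG_SIZE,
    HEADER_BAD_BLCKSZ
} HeaderCheck;

/* Compare the file's recorded sizes with this build's compiled-in sizes. */
static HeaderCheck
check_long_header(const LongHeader *hdr)
{
    assert(hdr != NULL);
    if (hdr->seg_size != XLOG_SEG_SIZE)
        return HEADER_BAD_SEG_SIZE;
    if (hdr->xlog_blcksz != XLOG_BLCKSZ)
        return HEADER_BAD_BLCKSZ;
    return HEADER_OK;
}
```

In a log-shipping setup, a check of this kind would let a standby built with a different WAL page size reject a shipped file up front rather than misparse it. Adding a field to the header is what makes the format change mentioned above necessary.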
Simon Riggs <simon@2ndquadrant.com> writes: > On Tue, 2006-04-04 at 11:13 -0400, Tom Lane wrote: >> Hm. The entire point of having a BLCKSZ-sized control file is to have >> it *not* change in size across format revisions (see the comments) ... >> which I suppose means that we really ought to have a hard-wired separate >> constant, rather than depending on something that someone might want to >> twiddle. > Patch enclosed. Applied with minor editorialization (I thought the symbol ought to be defined in pg_control.h, not some random xlog file). regards, tom lane
On Tue, 2006-04-04 at 18:41 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Tue, 2006-04-04 at 11:13 -0400, Tom Lane wrote:
> >> Hm. The entire point of having a BLCKSZ-sized control file is to have
> >> it *not* change in size across format revisions (see the comments) ...
> >> which I suppose means that we really ought to have a hard-wired separate
> >> constant, rather than depending on something that someone might want to
> >> twiddle.
>
> > Patch enclosed.
>
> Applied with minor editorialization

Thanks.

> (I thought the symbol ought to be
> defined in pg_control.h, not some random xlog file).

It wasn't a random xlog header file. The symbol was placed adjacent to this line in xlog_internal.h:

#define XLOG_CONTROL_FILE "global/pg_control"

Perhaps that should be moved to pg_control.h also? Or at least a comment in pg_control to indicate where pg_control's location is defined.

Best Regards,
Simon Riggs
On Tue, 2006-04-04 at 15:26 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > An additional patch enclosed that adds xlog blcksz onto the xlog long > > header at the start of each xlog file, so we can cross-check between > > file and system, as we do with xlog seg size. > > That would require an xlog format change (XLOG_PAGE_MAGIC bump). Might > be worth it anyway to avoid confusion in PITR log-shipping situations. > I thought about it yesterday, and concluded it wasn't really worth the > trouble, but am willing to reconsider. Thanks, > Note you forgot pg_resetxlog again .-) Eyeball polishing now in progress. Best Regards, Simon Riggs