WIP: Pg_upgrade - page layout converter (PLC) hook - Mailing list pgsql-hackers

From Zdenek Kotala
Subject WIP: Pg_upgrade - page layout converter (PLC) hook
Msg-id 4804878B.3040709@sun.com
Responses Re: WIP: Pg_upgrade - page layout converter (PLC) hook  (Heikki Linnakangas <heikki@enterprisedb.com>)
Re: WIP: Pg_upgrade - page layout converter (PLC) hook  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
List pgsql-hackers
I have attached a patch that implements a page layout converter (PLC) hook. It
is the cornerstone for in-place upgrade.

How it works:

When a PLC module is loaded, the conversion routine is called for each page
read from disk that does not have the native page layout version. The buffer is
then marked dirty and the upgraded page is written to WAL.
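
To make the control flow concrete, here is a minimal standalone sketch of the
check described above. It is not the patch itself: everything prefixed with
sketch_ or SKETCH_ is hypothetical, the layout version is assumed to live in
the first byte of the page purely for illustration, and the dirty marking and
WAL logging are reduced to a return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLCKSZ          8192
#define SKETCH_NATIVE_VERSION  4    /* stands in for PG_PAGE_LAYOUT_VERSION */

/* Same shape as the plc_hook_type declared in the patch. */
typedef void (*plc_hook_type)(char *buffer);

/* NULL means no PLC module is loaded. */
plc_hook_type plc_hook = NULL;

/* Hypothetical accessor: the sketch keeps the version in byte 0. */
int
sketch_page_version(const char *page)
{
    return (unsigned char) page[0];
}

/* Hypothetical converter a PLC module could install as the hook. */
void
sketch_convert_page(char *page)
{
    page[0] = SKETCH_NATIVE_VERSION;
}

/*
 * Called right after a page has been read from disk.
 * Returns 1 if the page was converted (the caller would then mark the
 * buffer dirty and log the page), 0 if it was left untouched.
 */
int
sketch_after_read(char *page)
{
    if (plc_hook != NULL &&
        sketch_page_version(page) != SKETCH_NATIVE_VERSION)
    {
        plc_hook(page);
        return 1;
    }
    return 0;
}
```

A module would set plc_hook = sketch_convert_page when it is loaded. Without a
module the read path is unchanged, and a page that has already been converted
is not converted again.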

Performance:

I ran "select count(*) from table" on a 2.2 GB table (4671039 rows) without any
tuning. With conversion it took 2033 s (34 min); after conversion and a server
restart it took 31 s (0.5 min).

Request for comments:

1) I am not sure if calling log_newpage is correct.

   a) Calling something in the access method from storage seems to me like a
bad thing. I am thinking of moving log_newpage to storage, but that raises more
questions about placement, RM ...

   b) log_newpage is meant for logging new pages, but I use it for storing the
converted page. It seems safe to me, and heap_xlog_newpage works correctly for
both new and converted pages. My only doubt is about the assert macro in
mdextend/mdwrite, which checks extend vs. write.


2) PLC module placement. I am looking for the best place (directory) to put the
PLC code. One possibility is contrib/pg_upgrade; another is
backend/storage/upgrade/, but in that location it would not be possible to
build it as a module.

3) Data structure version tracking

For PLC I need the old versions of data structures such as the page header,
tuple header and so on. This is also useful for external tools that need to
handle several versions of PostgreSQL easily (e.g. pg_control should show data
from all supported PostgreSQL versions).

My idea is to keep a separate header for each structure version, e.g.
bufpage_03.h, bufpage_04.h ..., which will contain typedef struct
PageHeaderData_03 ..., and a generic bufpage.h with the following content:

...
#include "bufpage_04.h"
...
typedef PageHeaderData_04 PageHeaderData;

#define PageGetPageSize(page) PageGetPageSize_04(page)
...
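
Spelled out in a single file, the scheme above might look like the following
sketch. The struct fields are simplified and illustrative only, not the real
page header layouts; uint16_t stands in for the backend's own integer types,
and the per-version macros only show how the generic names would be aliased to
the newest version.

```c
#include <assert.h>
#include <stdint.h>

/* ---- would live in bufpage_03.h (fields simplified, illustrative) ---- */
typedef struct PageHeaderData_03
{
    uint32_t    pd_tli;
    uint16_t    pd_lower;
    uint16_t    pd_upper;
    uint16_t    pd_pagesize_version;    /* high byte: size, low byte: version */
} PageHeaderData_03;

#define PageGetPageSize_03(page) \
    ((uint16_t) (((const PageHeaderData_03 *) (page))->pd_pagesize_version & 0xFF00))

/* ---- would live in bufpage_04.h (fields simplified, illustrative) ---- */
typedef struct PageHeaderData_04
{
    uint16_t    pd_tli;
    uint16_t    pd_flags;
    uint16_t    pd_lower;
    uint16_t    pd_upper;
    uint16_t    pd_pagesize_version;    /* high byte: size, low byte: version */
} PageHeaderData_04;

#define PageGetPageSize_04(page) \
    ((uint16_t) (((const PageHeaderData_04 *) (page))->pd_pagesize_version & 0xFF00))

/* ---- generic bufpage.h: alias the newest version to the plain names ---- */
typedef PageHeaderData_04 PageHeaderData;

#define PageGetPageSize(page) PageGetPageSize_04(page)
```

Backend code would keep using PageHeaderData and PageGetPageSize unchanged,
while a converter or an external tool could include the older headers and use
the _03 names explicitly.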


4) How to handle a corrupted page? If a page is corrupted, it could trigger a
spurious call of the conversion routine. That could hide problems, and the
conversion could "fix" the page in the wrong way. We probably need a
PageHeaderIsValid for every page layout version.
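
One possible shape for such a check is sketched below: the page is classified
by its claimed version first, and conversion is only allowed when the header
also passes the sanity checks for that version. All names are hypothetical, the
header struct is simplified, the header sizes 20 and 24 are assumptions, and
all-zero (new) pages are deliberately not handled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192

/* Simplified header: only the fields the sanity checks look at. */
typedef struct SketchPageHeader
{
    uint16_t    pd_lower;
    uint16_t    pd_upper;
    uint16_t    pd_special;
    uint16_t    pd_pagesize_version;    /* high byte: size, low byte: version */
} SketchPageHeader;

typedef enum
{
    SKETCH_PAGE_NATIVE,             /* current layout, header looks sane */
    SKETCH_PAGE_NEEDS_CONVERSION,   /* old layout, header looks sane */
    SKETCH_PAGE_CORRUPT             /* unknown version or inconsistent header */
} SketchPageState;

/* Pointer ordering checks shared by both versions in this sketch. */
static bool
sketch_bounds_ok(const SketchPageHeader *h, uint16_t header_size)
{
    return (h->pd_pagesize_version & 0xFF00) == SKETCH_BLCKSZ &&
           h->pd_lower >= header_size &&
           h->pd_lower <= h->pd_upper &&
           h->pd_upper <= h->pd_special &&
           h->pd_special <= SKETCH_BLCKSZ;
}

/* Decide what to do with a freshly read page before any hook is called. */
SketchPageState
sketch_classify_page(const SketchPageHeader *h)
{
    switch (h->pd_pagesize_version & 0x00FF)
    {
        case 4:     /* assumed native version, assumed 24-byte header */
            return sketch_bounds_ok(h, 24) ? SKETCH_PAGE_NATIVE
                                           : SKETCH_PAGE_CORRUPT;
        case 3:     /* assumed old version, assumed 20-byte header */
            return sketch_bounds_ok(h, 20) ? SKETCH_PAGE_NEEDS_CONVERSION
                                           : SKETCH_PAGE_CORRUPT;
        default:
            return SKETCH_PAGE_CORRUPT;
    }
}
```

With something like this, the conversion routine would only be called for
SKETCH_PAGE_NEEDS_CONVERSION, so a garbage version byte is reported as
corruption instead of being passed to the converter.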



        Thanks for your comments

Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.228
diff -c -r1.228 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c    1 Jan 2008 19:45:51 -0000    1.228
--- src/backend/storage/buffer/bufmgr.c    11 Apr 2008 15:30:28 -0000
***************
*** 41,46 ****
--- 41,47 ----
  #include "storage/proc.h"
  #include "storage/smgr.h"
  #include "utils/resowner.h"
+ #include "access/heapam.h"
  #include "pgstat.h"


***************
*** 67,72 ****
--- 68,75 ----
                                   * bufmgr */
  long        NDirectFileWrite;    /* e.g., I/O in psort and hashjoin. */

+ /* Hook for page layout convertor */
+ plc_hook_type plc_hook = NULL;

  /* local state for StartBufferIO and related functions */
  static volatile BufferDesc *InProgressBuf = NULL;
***************
*** 290,296 ****
--- 293,308 ----
          if (zeroPage)
              MemSet((char *) bufBlock, 0, BLCKSZ);
          else
+         {
              smgrread(reln->rd_smgr, blockNum, (char *) bufBlock);
+             /* Page Layout Convertor hook. We assume that page version is on same place. */
+             if( plc_hook &&  PageGetPageLayoutVersion(bufBlock) != PG_PAGE_LAYOUT_VERSION )
+             {
+                 plc_hook((char *)bufBlock);
+                 bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED);
+                 log_newpage(&reln->rd_node, blockNum ,bufBlock);
+             }
+         }
          /* check for garbage data */
          if (!PageHeaderIsValid((PageHeader) bufBlock))
          {
Index: src/include/storage/bufmgr.h
===================================================================
RCS file: /zfs_data/cvs_pgsql/cvsroot/pgsql/src/include/storage/bufmgr.h,v
retrieving revision 1.111
diff -c -r1.111 bufmgr.h
*** src/include/storage/bufmgr.h    1 Jan 2008 19:45:58 -0000    1.111
--- src/include/storage/bufmgr.h    28 Mar 2008 14:23:03 -0000
***************
*** 28,33 ****
--- 28,37 ----
      BAS_VACUUM                    /* VACUUM */
  } BufferAccessStrategyType;

+ /* Hook for page layout convertor plugin */
+ typedef void (*plc_hook_type)(char *buffer);
+ extern PGDLLIMPORT plc_hook_type plc_hook;
+
  /* in globals.c ... this duplicates miscadmin.h */
  extern PGDLLIMPORT int NBuffers;

