Thread: WAL file naming sequence definition

WAL file naming sequence definition

From
"Andrew Hammond"
Date:
I'd like confirmation on how WAL files are named. I'm trying to write a
tool which can tell me when we are missing a WAL file from the sequence.
I initially thought that the file names were monotonically incrementing
hexadecimal numbers. This doesn't appear to be the case.

00000001000001B7000000FD
00000001000001B7000000FE
(there seem to be a whole bunch of missing filenames in the sequence here)
00000001000001B800000000
00000001000001B800000001

This pattern repeats. I hunted through the code and discovered the following in
src/include/access/xlog_internal.h.

#define XLogFilePath(path, tli, log, seg)   \
    snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X", tli, log, seg)

So, the names are not a single hexadecimal number, but instead three of
them concatenated together. This macro is used eight times in
src/backend/access/xlog.c. It seems clear that the first number, tli, is
a TimeLineID. I wasn't completely clear on the behavior of log and seg
until I found the following, also in xlog_internal.h.

#define NextLogSeg(logId, logSeg)   \
    do { \
        if ((logSeg) >= XLogSegsPerFile-1) \
        { \
            (logId)++; \
            (logSeg) = 0; \
        } \
        else \
            (logSeg)++; \
    } while (0)

So, clearly log simply
increments and seg increments until it reaches XLogSegsPerFile - 1, then
resets to 0. Again, xlog_internal.h knows what that is.

/*
 * We break each logical log file (xlogid value) into segment files of the
 * size indicated by XLOG_SEG_SIZE. One possible segment at the end of each
 * log file is wasted, to ensure that we don't have problems representing
 * last-byte-position-plus-1.
 */
#define XLogSegSize     ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)

In src/include/pg_config.h.in, I see

/* XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2
   and larger than XLOG_BLCKSZ (preferably, a great deal larger than
   XLOG_BLCKSZ). Changing XLOG_SEG_SIZE requires an initdb. */
#undef XLOG_SEG_SIZE

Then configure tells me the
following

# Check whether --with-wal-segsize was given.
if test "${with_wal_segsize+set}" = set; then
  withval=$with_wal_segsize;
  case $withval in
    yes)
      { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
    { (exit 1); exit 1; }; }
      ;;
    no)
      { { echo "$as_me:$LINENO: error: argument required for --with-wal-segsize
echo "$as_me: error: argument required for --with-wal-segsize option" >&2;}
    { (exit 1); exit 1; }; }
      ;;
    *)
      wal_segsize=$withval
      ;;
  esac

else
  wal_segsize=16
fi


case ${wal_segsize} in
  1) ;;
  2) ;;
  4) ;;
  8) ;;
  16) ;;
  32) ;;
  64) ;;
  *) { { echo "$as_me:$LINENO: error: Invalid WAL segment size. Allowed values a
echo "$as_me: error: Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,
   { (exit 1); exit 1; }; }
esac
{ echo "$as_me:$LINENO: result: ${wal_segsize}MB" >&5
echo "${ECHO_T}${wal_segsize}MB" >&6; }

cat >>confdefs.h <<_ACEOF
#define XLOG_SEG_SIZE (${wal_segsize} * 1024 * 1024)
_ACEOF

Since I didn't specify a wal_segsize at
compile time, it seems that my XLogSegsPerFile should be
0xffffffff / (16 * 1024 * 1024) = 255
Which matches nicely with what I'm observing.

So, and this is where I want the double-check, a tool which verifies
there are no missing WAL files (based on names alone) in a series of WAL
files needs to know the following.

1) Timeline history (although perhaps not, it could simply verify all
   existing timelines)
2) What, if any, wal_segsize was specified for the database which is
   generating the WAL files

Am I missing anything? The format of .backup files seems pretty simple to
me. So I intend to do the following.
1) find the most recent .backup file
2) verify that all the files required for that .backup exist
3) see if there are any newer files, and
4) if there are newer files, warn if any are missing from the sequence

Would this be reasonable and is there any community interest in
open-sourcing the tool that I'm building?

Andrew
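For illustration only, the name-based check described above might be
sketched in Python roughly as follows. This is not the tool itself. The
function names (parse_wal_name, next_wal_name, find_missing) are made up
for this example, the default of 255 segments per log assumes the default
16MB wal_segsize, and the gap search assumes every name belongs to a
single timeline.

```python
import re

# A WAL segment name is 24 uppercase hex digits: tli, log, seg (8 each),
# as produced by the "%08X%08X%08X" format in XLogFilePath.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def parse_wal_name(name):
    """Split a WAL file name into (tli, log, seg) integers."""
    if not WAL_NAME.match(name):
        raise ValueError("not a WAL segment file name: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def next_wal_name(name, segs_per_file=255):
    """Name of the segment following `name`, mirroring NextLogSeg."""
    tli, log, seg = parse_wal_name(name)
    if seg >= segs_per_file - 1:
        log, seg = log + 1, 0
    else:
        seg += 1
    return "%08X%08X%08X" % (tli, log, seg)


def find_missing(names, segs_per_file=255):
    """Names absent between the lowest and highest name supplied.

    Assumption: all names share one timeline; otherwise ValueError.
    """
    present = set(names)
    if not present:
        return []
    for name in present:
        parse_wal_name(name)  # validate before comparing
    if len({name[:8] for name in present}) != 1:
        raise ValueError("expected names from a single timeline")
    # Fixed-width uppercase hex sorts lexically in numeric order.
    current, last = min(present), max(present)
    missing = []
    while True:
        current = next_wal_name(current, segs_per_file)
        if current >= last:
            break
        if current not in present:
            missing.append(current)
    return missing
```

With the four file names quoted at the top of this message, find_missing
reports nothing, since 000000FE is the last segment of log 000001B7 at
the default segment size.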

Re: WAL file naming sequence definition

From
Simon Riggs
Date:
On Wed, 2008-05-14 at 14:25 -0700, Andrew Hammond wrote:
> I'd like confirmation on how WAL files are named. I'm trying to write a
> tool which can tell me when we are missing a WAL file from the
> sequence. I initially thought that the file names were monotonically
> incrementing hexadecimal numbers. This doesn't appear to be the case.
> 
> 00000001000001B7000000FD
> 00000001000001B7000000FE
> (there seem to be a whole bunch of missing filenames in the sequence
> here)
> 00000001000001B800000000
> 00000001000001B800000001

...

> Since I didn't specify a wal_segsize at compile time, it seems that my
> XLogSegsPerFile should be
> 0xffffffff / (16 * 1024 * 1024) = 255
> Which matches nicely with what I'm observing.

Yes, that's the default.
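As a rough cross-check of that arithmetic for the other segment sizes
that the configure case statement accepts, here is a small Python sketch.
The helper names are hypothetical, and the integer division is assumed to
behave like the uint32 division in the XLogSegsPerFile macro.

```python
# Segment sizes (in MB) accepted by the configure case statement.
ALLOWED_SEGSIZES_MB = (1, 2, 4, 8, 16, 32, 64)


def segs_per_file(wal_segsize_mb=16):
    """Integer form of XLogSegsPerFile = 0xffffffff / XLogSegSize."""
    return 0xFFFFFFFF // (wal_segsize_mb * 1024 * 1024)


def last_segment_hex(wal_segsize_mb=16):
    """Highest seg value expected in a file name, as 8 hex digits."""
    return "%08X" % (segs_per_file(wal_segsize_mb) - 1)


if __name__ == "__main__":
    for mb in ALLOWED_SEGSIZES_MB:
        print(mb, segs_per_file(mb), last_segment_hex(mb))
```

For the default of 16 this gives 255 segments per log, so the last seg in
a file name is 000000FE, as in the listing quoted above.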

> So, and this is where I want the double-check, a tool which verifies
> there are no missing WAL files (based on names alone) in a series of
> WAL files needs to know the following.
> 
> 1) Timeline history (although perhaps not, it could simply verify all
> existing timelines)
> 2) What, if any, wal_segsize was specified for the database which is
> generating the WAL files

Yes.

I wouldn't worry too much about the timeline id. Getting them sequential
within a single timeline is 99.998% of the problem.
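One way to do that per-timeline check, sketched in Python under some
assumptions: each (log, seg) pair is mapped to a linear index so that a
gap is simply a missing integer, and each timeline is examined on its
own. The function name gaps_by_timeline is invented for this example, and
segs_per_file=255 assumes the default 16MB segment size. If the wrong
segment size is assumed, the index mapping is no longer valid.

```python
import re
from collections import defaultdict

# Three groups of 8 uppercase hex digits: tli, log, seg.
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")


def gaps_by_timeline(filenames, segs_per_file=255):
    """Map each timeline id to the segment names missing within it."""
    seen = defaultdict(set)
    for name in filenames:
        match = WAL_NAME.match(name)
        if match is None:
            continue  # not a plain segment name (.history, .backup, ...)
        tli, log, seg = (int(group, 16) for group in match.groups())
        # Linear position of the segment within its timeline.
        seen[tli].add(log * segs_per_file + seg)

    report = {}
    for tli, indexes in seen.items():
        expected = set(range(min(indexes), max(indexes) + 1))
        absent = sorted(expected - indexes)
        report[tli] = [
            "%08X%08X%08X" % ((tli,) + divmod(index, segs_per_file))
            for index in absent
        ]
    return report
```

Timelines are never compared with each other here, so a switch from one
timeline to the next is not itself reported as a gap.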

> Am I missing anything? The format of .backup files seems pretty simple
> to me. So I intend to do the following.
> 1) find the most recent .backup file
> 2) verify that all the files required for that .backup exist
> 3) see if there are any newer files, and 
> 4) if there are newer files, warn if any are missing from the sequence
> 
> Would this be reasonable and is there any community interest in
> open-sourcing the tool that I'm building?

Sounds good.
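Steps 1 and 3 of that plan could look something like the Python sketch
below, working from file names alone. It relies on the backup history
file being named after the first WAL segment the backup needs, followed
by an offset and ".backup". Treating the highest-sorting .backup name as
the "most recent" one is an assumption of this example, and
split_at_latest_backup is a hypothetical name. Step 2 is not covered: the
last segment a backup needs is recorded inside the .backup file (the STOP
WAL LOCATION line), which this sketch does not read.

```python
import re

WAL_NAME = re.compile(r"^[0-9A-F]{24}$")
# <first WAL segment>.<offset>.backup
BACKUP_NAME = re.compile(r"^([0-9A-F]{24})\.[0-9A-F]{8}\.backup$")


def split_at_latest_backup(filenames):
    """Return (backup_file, start_segment, segments_from_start).

    Assumption: the highest-sorting .backup name is the most recent.
    """
    backups = sorted(n for n in filenames if BACKUP_NAME.match(n))
    if not backups:
        return None, None, []
    latest = backups[-1]
    start = BACKUP_NAME.match(latest).group(1)
    # Fixed-width uppercase hex names sort lexically in numeric order.
    newer = sorted(n for n in filenames if WAL_NAME.match(n) and n >= start)
    return latest, start, newer
```

The returned list of segments, from the backup's starting segment onward,
is what a sequence check would then walk looking for gaps.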

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com



Re: WAL file naming sequence definition

From
Josh Berkus
Date:
Andrew,

> Would this be reasonable and is there any community interest in
> open-sourcing the tool that I'm building?

Yes, definitely.

We should find a way to bundle your tool together with other physical 
integrity checking tools.  Eventually we can have a "check crashed 
postgresql suite".

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco