Re: pg_walfile_name_offset can return inconsistent values - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: pg_walfile_name_offset can return inconsistent values |
Date | |
Msg-id | CAEze2Wh7qWW6jbULq6mZ14fp_hNh9gM53aq4cG0b1m3DkfLU3w@mail.gmail.com Whole thread Raw |
In response to | Re: pg_walfile_name_offset can return inconsistent values (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: pg_walfile_name_offset can return inconsistent values
|
List | pgsql-hackers |
On Thu, 9 Nov 2023 at 20:22, Bruce Momjian <bruce@momjian.us> wrote: > I know this bug report is four years old, but it is still a > pg_walfile_name_offset() bug. Here is the bug: > > SELECT * > FROM (VALUES ('0/16ffffff'), ('0/17000000'), ('0/17000001')) AS t(lsn), > LATERAL pg_walfile_name_offset(lsn::pg_lsn); > > lsn | file_name | file_offset > ------------+--------------------------+------------- > 0/16ffffff | 000000010000000000000016 | 16777215 > --> 0/17000000 | 000000010000000000000016 | 0 > 0/17000001 | 000000010000000000000017 | 1 > > The bug is in the indicated line --- it shows the filename as 00016 but > offset as zero, when clearly the LSN is pointing to 17/0. The bug is > essentially that the code for pg_walfile_name_offset() uses the exact > offset from the LSN, but uses the file name from the previous byte of > the LSN. Yes, that's definitely not a correct result. > The fix involves deciding what the description or purpose of > pg_walfile_name_offset() means, and adjusting it to be clearer. The > current documentation says: > > Converts a write-ahead log location to a WAL file name and byte > offset within that file. > > Fix #1: If we assume write-ahead log location means LSN, it is saying > show the file/offset of the LSN, and that is most clearly: > > lsn | file_name | file_offset > ------------+--------------------------+------------- > 0/16ffffff | 000000010000000000000016 | 16777215 > 0/17000000 | 000000010000000000000017 | 0 > 0/17000001 | 000000010000000000000017 | 1 > > Fix #2: Now, there are some who have said they want the output to be > the last written WAL byte (the byte before the LSN), not the current > LSN, for archiving purposes. However, if we do that, we have to update > the docs to clarify it. Its output would be: > > lsn | file_name | file_offset > ------------+--------------------------+------------- > 0/16ffffff | 000000010000000000000016 | 16777214 > 0/17000000 | 000000010000000000000016 | 16777215 > 0/17000001 | 000000010000000000000017 | 0 > > I have attached fix #1 as offset1.diff and fix #2 as offset2.diff. I believe you got the references wrong; fix #1 looks like the output of offset2's changes, and fix #2 looks like the result of offset1's changes. Either way, I think fix #1 is most correct (as was attached in offset2.diff, and quoted verbatim here), because that has no chance of having surprising underflowing behaviour when you use '0/0'::lsn as input. > diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c > index 45a70668b1..e65502d51e 100644 > --- a/src/backend/access/transam/xlogfuncs.c > +++ b/src/backend/access/transam/xlogfuncs.c > @@ -414,7 +414,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS) > /* > * xlogfilename > */ > - XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size); > + XLByteToSeg(locationpoint, xlogsegno, wal_segment_size); > XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno, > wal_segment_size); Kind regards, Matthias van de Meent
pgsql-hackers by date: