Thread: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
Hi, It looks like there's an opportunity to replace explicit WAL file parsing code with XLogFromFileName() in pg_resetwal.c. This was not done then (in PG 10) because the XLogFromFileName() wasn't accepting file size as an input parameter (see [1]) and pg_resetwal needed to use WAL file size from the controlfile. Thanks to the commit fc49e24fa69a15efacd5b8958115ed9c43c48f9a which added the wal_segsz_bytes parameter to XLogFromFileName(). I'm attaching a small patch herewith. This removes using extra variables in pg_resetwal.c and a bit of duplicate code too (5 LOC). Thoughts? [1] https://github.com/postgres/postgres/blob/REL_10_STABLE/src/include/access/xlog_internal.h -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
From
Kyotaro Horiguchi
Date:
At Tue, 4 Oct 2022 11:06:15 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > It looks like there's an opportunity to replace explicit WAL file > parsing code with XLogFromFileName() in pg_resetwal.c. This was not > done then (in PG 10) because the XLogFromFileName() wasn't accepting > file size as an input parameter (see [1]) and pg_resetwal needed to > use WAL file size from the controlfile. Thanks to the commit > fc49e24fa69a15efacd5b8958115ed9c43c48f9a which added the > wal_segsz_bytes parameter to XLogFromFileName(). Nice finding. I found a few '%08X%08X's but they don't seem to fit similar fix. > I'm attaching a small patch herewith. This removes using extra > variables in pg_resetwal.c and a bit of duplicate code too (5 LOC). > > Thoughts? > - segs_per_xlogid = (UINT64CONST(0x0000000100000000) / ControlFile.xlog_seg_size); > newXlogSegNo = ControlFile.checkPointCopy.redo / ControlFile.xlog_seg_size; Couldn't we use XLByteToSeg() here? Other than that, it looks good to me. > [1] https://github.com/postgres/postgres/blob/REL_10_STABLE/src/include/access/xlog_internal.h regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
From
Bharath Rupireddy
Date:
On Tue, Oct 4, 2022 at 11:47 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > - segs_per_xlogid = (UINT64CONST(0x0000000100000000) / ControlFile.xlog_seg_size); > > newXlogSegNo = ControlFile.checkPointCopy.redo / ControlFile.xlog_seg_size; > > Couldn't we use XLByteToSeg() here? Yes, we could. > Other than that, it looks good to me. Thanks. There are a few more assorted WAL file related things I found, I will be sending all of them in this thread itself in a while. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
From
Kyotaro Horiguchi
Date:
At Tue, 04 Oct 2022 15:17:06 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > Other than that, it looks good to me. Sorry I have another comment. > - unsigned int tli, > - log, > - seg; > + unsigned int tli; > XLogSegNo segno; TLI should be of type TimeLineID. This is not directly related to this patch, pg_resetwal.c has the following line.. > static uint32 minXlogTli = 0; regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
From
Michael Paquier
Date:
On Tue, Oct 04, 2022 at 03:17:06PM +0900, Kyotaro Horiguchi wrote: > Nice finding. I found a few '%08X%08X's but they don't seem to fit > similar fix. Nice cleanup. > Couldn't we use XLByteToSeg() here? > > Other than that, it looks good to me. Yep. It looks that you're right here. -- Michael
Attachment
Re: Use XLogFromFileName() in pg_resetwal to parse position from WAL file
From
Kyotaro Horiguchi
Date:
At Tue, 04 Oct 2022 15:23:48 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in > This is not directly related to this patch, pg_resetwal.c has the > following line.. > > > static uint32 minXlogTli = 0; I have found other three instances of this in xlog.c and pg_receivewal.c. Do they worth fixing? (pg_upgarade.c has "uint32 tli/logid/segno but I'm not sure they need to be "fixed". At least the segno is not a XLogSegNo.) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Assorted fixes related to WAL files (was: Use XLogFromFileName() in pg_resetwal to parse position from WAL file)
From
Bharath Rupireddy
Date:
On Tue, Oct 4, 2022 at 12:11 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > > > > static uint32 minXlogTli = 0; > > I have found other three instances of this in xlog.c and > pg_receivewal.c. Do they worth fixing? > > (pg_upgarade.c has "uint32 tli/logid/segno but I'm not sure they need > to be "fixed". At least the segno is not a XLogSegNo.) There are quite a number of places where data types need to be fixed, see XLogFileNameById() callers. They are all being parsed as uint32 and then used. I'm not sure if we want to fix all of them. I think I found that we can fix/refactor few WAL file related things: 1. 0001 replaces explicit WAL file parsing code with XLogFromFileName() and uses XLByteToSeg() in pg_resetwal.c. This was not done then (in PG 10) because the XLogFromFileName() wasn't accepting file size as an input parameter and pg_resetwal needed to use WAL file size from the controlfile. Thanks to the commit fc49e24fa69a15efacd5b8958115ed9c43c48f9a which added the wal_segsz_bytes parameter to XLogFromFileName(). This removes using extra variables in pg_resetwal.c and a bit of duplicate code too. It also replaces the explicit code with the XLByteToSeg() macro. 2. 0002 replaces MAXPGPATH with MAXFNAMELEN for WAL file names. MAXFNAMELEN (64 bytes) is typically meant to be used for all WAL file names across the code base. Because the WAL file names in postgres can't be bigger than 64 bytes, in fact, not more than XLOG_FNAME_LEN (24 bytes) but there are suffixes, timeline history files etc. To accommodate all of that MAXFNAMELEN is introduced. There are some places in the code base that still use MAXPGPATH (1024 bytes) for WAL file names which is an unnecessary wastage of stack memory. This makes code consistent across and saves a bit of space. 3. 0003 replaces WAL file name calculation with XLogFileNameById() in pg_upgrade/controldata.c to be consistent across the code base. Note that this requires us to change the nextxlogfile size from hard-coded 25 bytes to MAXFNAMELEN (64 bytes). I'm attaching the v2 patch set. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Re: Assorted fixes related to WAL files (was: Use XLogFromFileName() in pg_resetwal to parse position from WAL file)
From
Kyotaro Horiguchi
Date:
At Tue, 4 Oct 2022 13:20:54 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Tue, Oct 4, 2022 at 12:11 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > > > > > > > static uint32 minXlogTli = 0; > > > > I have found other three instances of this in xlog.c and > > pg_receivewal.c. Do they worth fixing? > > > > (pg_upgarade.c has "uint32 tli/logid/segno but I'm not sure they need > > to be "fixed". At least the segno is not a XLogSegNo.) > > There are quite a number of places where data types need to be fixed, > see XLogFileNameById() callers. They are all being parsed as uint32 > and then used. I'm not sure if we want to fix all of them. > > I think I found that we can fix/refactor few WAL file related things: > > 1. 0001 replaces explicit WAL file parsing code with > XLogFromFileName() and uses XLByteToSeg() in pg_resetwal.c. This was > not done then (in PG 10) because the XLogFromFileName() wasn't > accepting file size as an input parameter and pg_resetwal needed to > use WAL file size from the controlfile. Thanks to the commit > fc49e24fa69a15efacd5b8958115ed9c43c48f9a which added the > wal_segsz_bytes parameter to XLogFromFileName(). This removes using > extra variables in pg_resetwal.c and a bit of duplicate code too. It > also replaces the explicit code with the XLByteToSeg() macro. Looks good to me. > 2. 0002 replaces MAXPGPATH with MAXFNAMELEN for WAL file names. > MAXFNAMELEN (64 bytes) is typically meant to be used for all WAL file > names across the code base. Because the WAL file names in postgres > can't be bigger than 64 bytes, in fact, not more than XLOG_FNAME_LEN > (24 bytes) but there are suffixes, timeline history files etc. To > accommodate all of that MAXFNAMELEN is introduced. There are some > places in the code base that still use MAXPGPATH (1024 bytes) for WAL > file names which is an unnecessary wastage of stack memory. This makes > code consistent across and saves a bit of space. Looks reasonable, too. I don't find other instances of the same mistake. > 3. 0003 replaces WAL file name calculation with XLogFileNameById() in > pg_upgrade/controldata.c to be consistent across the code base. Note > that this requires us to change the nextxlogfile size from hard-coded > 25 bytes to MAXFNAMELEN (64 bytes). I'm not sure I like this. In other places where XLogFileNameById() is used, the buffer is known to store longer strings so MAXFNAMELEN is reasonable. But we don't need to add useless 39 bytes here. Therefore, even if I wanted to change it, I would replace it with "XLOG_FNAME_LEN + 1". > I'm attaching the v2 patch set. > > Thoughts? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Assorted fixes related to WAL files (was: Use XLogFromFileName() in pg_resetwal to parse position from WAL file)
From
Bharath Rupireddy
Date:
On Tue, Oct 4, 2022 at 2:01 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > 1. 0001 replaces explicit WAL file parsing code with > > Looks good to me. > > > 2. 0002 replaces MAXPGPATH with MAXFNAMELEN for WAL file names. > > Looks reasonable, too. I don't find other instances of the same mistake. Thanks for reviewing. > > 3. 0003 replaces WAL file name calculation with XLogFileNameById() in > > pg_upgrade/controldata.c to be consistent across the code base. Note > > that this requires us to change the nextxlogfile size from hard-coded > > 25 bytes to MAXFNAMELEN (64 bytes). > > I'm not sure I like this. In other places where XLogFileNameById() is > used, the buffer is known to store longer strings so MAXFNAMELEN is > reasonable. But we don't need to add useless 39 bytes here. > Therefore, even if I wanted to change it, I would replace it with > "XLOG_FNAME_LEN + 1". I'm fine with doing either of these things. Let's hear from others. I've added a CF entry - https://commitfest.postgresql.org/40/3927/ -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Assorted fixes related to WAL files (was: Use XLogFromFileName() in pg_resetwal to parse position from WAL file)
From
Michael Paquier
Date:
On Tue, Oct 04, 2022 at 06:24:18PM +0530, Bharath Rupireddy wrote: > I'm fine with doing either of these things. Let's hear from others. > > I've added a CF entry - https://commitfest.postgresql.org/40/3927/ About 0002, I am not sure that it is worth bothering. Sure, this wastes a few bytes, but I recall that there are quite a few places in the code where we imply a WAL segment but append a full path to it, and this creates a few bumps with back-patches. --- a/src/bin/pg_upgrade/pg_upgrade.h +++ b/src/bin/pg_upgrade/pg_upgrade.h @@ -10,6 +10,7 @@ #include <sys/stat.h> #include <sys/time.h> +#include "access/xlog_internal.h" #include "common/relpath.h" #include "libpq-fe.h" Well, xlog_internal.h includes a few backend-only definitions. I'd rather have us untangle more the backend/frontend dependencies before adding more includes of this type, even if I agree that nextxlogfile would be better with the implied name size limit. Saying that, 0001 is a nice catch, so applied it. I have switched the two TLI variables to use TimeLineID, while touching the area. -- Michael