Thread: pg_rewind copies
If a file is modified and becomes larger in the source system while pg_rewind is running, pg_rewind can leave behind a partial copy of file. That's by design, and it's OK for relation files because they're replayed from WAL. But it can cause trouble for configuration files. I ran into this while playing with pg_auto_failover. After failover, pg_auto_failover would often launch pg_rewind, and run ALTER SYSTEM on the primary while pg_rewind was running. The resulting rewound system would fail to start up: Nov 13 09:24:42 pg-node-a pg_autoctl[2217]: 09:24:42 2220 ERROR 2020-11-13 09:24:32.547 GMT [2246] LOG: syntax error in file "/data/pgdata/postgresql.auto.conf" line 4, near token "'" Nov 13 09:24:42 pg-node-a pg_autoctl[2217]: 09:24:42 2220 ERROR 2020-11-13 09:24:32.547 GMT [2246] FATAL: configuration file "postgresql.auto.conf" contains errors Attached is a patch to mitigate that. It changes pg_rewind so that when it copies a whole file, it ignores the original file size. It's not a complete cure: it still believes the original size for files larger than 1 MB. That limit was just expedient given the way the chunking logic in libpq_source.c works, but should be enough for configuration files. There's another race condition that this doesn't try to fix: If a file is modified while it's being copied, you can have a torn file with one half of the file from the old version, and one half from the new. That's a much more narrow window, though, and pg_basebackup has the same problem. - Heikki
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested Hello The patch seems to do as described and the regression and tap tests are passing + /* + * A local source is not expected to change while we're rewinding, so check + * that we size of the file matches our earlier expectation. + */ Is this a tyoo? thanks Cary
On 16/12/2020 00:08, Cary Huang wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: tested, passed > Documentation: not tested > > Hello > > The patch seems to do as described and the regression and tap tests are passing > + /* > + * A local source is not expected to change while we're rewinding, so check > + * that we size of the file matches our earlier expectation. > + */ > Is this a tyoo? Yep, thanks! Attached is a new patch version, with that fixed and rebased. No other changes. - Heikki
Attachment
On 1/22/21 8:26 AM, Heikki Linnakangas wrote: > On 16/12/2020 00:08, Cary Huang wrote: >> The following review has been posted through the commitfest application: >> make installcheck-world: tested, passed >> Implements feature: tested, passed >> Spec compliant: tested, passed >> Documentation: not tested >> >> Hello >> >> The patch seems to do as described and the regression and tap tests >> are passing >> + /* >> + * A local source is not expected to change while we're >> rewinding, so check >> + * that we size of the file matches our earlier expectation. >> + */ >> Is this a tyoo? > > Yep, thanks! Attached is a new patch version, with that fixed and > rebased. No other changes. Cary, does this patch look ready to commit? If so, please change the state in the CF entry to "Ready for Committer". Regards, -- -David david@pgmasters.net
> On 25 Mar 2021, at 14:32, David Steele <david@pgmasters.net> wrote: > > On 1/22/21 8:26 AM, Heikki Linnakangas wrote: >> On 16/12/2020 00:08, Cary Huang wrote: >>> The following review has been posted through the commitfest application: >>> make installcheck-world: tested, passed >>> Implements feature: tested, passed >>> Spec compliant: tested, passed >>> Documentation: not tested >>> >>> Hello >>> >>> The patch seems to do as described and the regression and tap tests are passing >>> + /* >>> + * A local source is not expected to change while we're rewinding, so check >>> + * that we size of the file matches our earlier expectation. >>> + */ >>> Is this a tyoo? >> Yep, thanks! Attached is a new patch version, with that fixed and rebased. No other changes. > > Cary, does this patch look ready to commit? If so, please change the state in the CF entry to "Ready for Committer". Reading the patch I think it definitely qualifies for RFC state. Heikki, do you have plans to address till patch in the ongoing CF? -- Daniel Gustafsson https://vmware.com/
I took another look at this patch, and I think it's ready to go in, it clearly fixes a bug that isn't too hard to hit in production settings. To ensure we don't break this I've added a testcase which pipes the pg_rewind --verbose output to a file it's asked to copy, which then guarantees that the file is growing in size during the operation without need for synchronizing two processes with IPC::Run (it also passes on Windows in the CI setup). One small comment on the patch: + snprintf(srcpath, sizeof(srcpath), "%s/%s", datadir, path); This should IMO check the returnvalue of snprintf to ensure it wasn't truncated. While the risk is exceedingly small, a truncated filename might match another existing filename and the error not getting caught. There is another instance just like this one in open_target_file() to which I think we should apply the same belts-and-suspenders treatment. I've fixed this in the attached version which also have had a pg_indent run on top of a fresh rebase. Thoughts on this version? -- Daniel Gustafsson https://vmware.com/
Attachment
On 01/04/2022 12:00, Daniel Gustafsson wrote: > I took another look at this patch, and I think it's ready to go in, it clearly > fixes a bug that isn't too hard to hit in production settings. To ensure we > don't break this I've added a testcase which pipes the pg_rewind --verbose > output to a file it's asked to copy, which then guarantees that the file is > growing in size during the operation without need for synchronizing two > processes with IPC::Run (it also passes on Windows in the CI setup). > > One small comment on the patch: > > + snprintf(srcpath, sizeof(srcpath), "%s/%s", datadir, path); > > This should IMO check the returnvalue of snprintf to ensure it wasn't > truncated. While the risk is exceedingly small, a truncated filename might > match another existing filename and the error not getting caught. There is > another instance just like this one in open_target_file() to which I think we > should apply the same belts-and-suspenders treatment. I've fixed this in the > attached version which also have had a pg_indent run on top of a fresh rebase. > + if (len >= sizeof(dstpath)) > + pg_fatal("filepath buffer too small"); /* shouldn't happen */ Makes sense. I would remove the "shouldn't happen"; it's not very hard to make it happen, you just need a very long target datadir path. And rephrase the error message as "datadir path too long". One typo in the commit message: s/update/updates/. Thanks! - Heikki
> On 1 Apr 2022, at 12:46, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> + if (len >= sizeof(dstpath)) >> + pg_fatal("filepath buffer too small"); /* shouldn't happen */ > > Makes sense. I would remove the "shouldn't happen"; it's not very hard to make it happen, you just need a very long targetdatadir path. And rephrase the error message as "datadir path too long". Right, good point. > One typo in the commit message: s/update/updates/. Will fix. -- Daniel Gustafsson https://vmware.com/
On 01.04.22 11:00, Daniel Gustafsson wrote: > One small comment on the patch: > > + snprintf(srcpath, sizeof(srcpath), "%s/%s", datadir, path); > > This should IMO check the returnvalue of snprintf to ensure it wasn't > truncated. While the risk is exceedingly small, a truncated filename might > match another existing filename and the error not getting caught. There is > another instance just like this one in open_target_file() to which I think we > should apply the same belts-and-suspenders treatment. I've fixed this in the > attached version which also have had a pg_indent run on top of a fresh rebase. We use snprintf() like that countless times, and approximately none of them check for overflow. So while you are right, this might not be the place to start a new policy. If you don't like this approach, use psprintf() perhaps.
> On 4 Apr 2022, at 15:08, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > We use snprintf() like that countless times, and approximately none of them check for overflow. So while you are right,this might not be the place to start a new policy. Fair enough, I'll remove these hunks before committing and will look a bigger picture patch to address these across the codebase later. -- Daniel Gustafsson https://vmware.com/