Thread: [HACKERS] pg_resetwal is broken if run from v10 against older version of PGdata directory
[HACKERS] pg_resetwal is broken if run from v10 against older version of PGdata directory
From
tushar
Date:
Hi, I have installed PG v9.6 / v9.5 , if we run pg_resetwal from v10 binaries against data directory of v9.6/9.5 ,getting this error - centos@centos-cpula bin]$ ./pg_resetwal -D /tmp/pg9.6/bin/data/ pg_resetwal: pg_control exists but is broken or unknown version; ignoring it pg_resetwal: could not open directory "pg_wal": No such file or directory [centos@centos-cpula bin]$ Steps to reproduce- installed PG v9.6 installed PG v10 go to bin directory of v10 and run pg_resetwal , provide -D = data directory of v9.6. -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Amit Kapila
Date:
On Mon, May 29, 2017 at 1:48 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: > Hi, > > I have installed PG v9.6 / v9.5 , if we run pg_resetwal from v10 binaries > against data directory of v9.6/9.5 ,getting this error - > > centos@centos-cpula bin]$ ./pg_resetwal -D /tmp/pg9.6/bin/data/ > pg_resetwal: pg_control exists but is broken or unknown version; ignoring it > This appears to be an expected error as you are trying to use a version of binaries that doesn't match the version stored in pg_control. What makes you think above is a valid usage and should pass? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
tushar
Date:
On 05/29/2017 03:10 PM, Amit Kapila wrote: > What makes you think above is a valid usage and should > pass? with earlier versions ,for instance - v.96 v/s v9.5 ,pg_resetwal was giving pg_control values . Installed v9.6 and v9.5 and run pg_resetwal of v9.6 against data directory of v9.5. [centos@centos-cpula ~]$ /tmp/pg9.6/bin/pg_resetxlog -D /tmp/pg9.5/bin/data/ pg_resetxlog: pg_control exists but is broken or unknown version; ignoring it Guessed pg_control values: pg_control version number: 960 Catalog version number: 201608131 Database system identifier: 6425491233437069295 Latest checkpoint's TimeLineID: 1 Latest checkpoint's full_page_writes: off Latest checkpoint's NextXID: 0:3 Latest checkpoint's NextOID: 10000 Latest checkpoint's NextMultiXactId: 1 Latest checkpoint's NextMultiOffset: 0 Latest checkpoint's oldestXID: 3 Latest checkpoint's oldestXID's DB: 0 Latest checkpoint's oldestActiveXID: 0 Latest checkpoint's oldestMultiXid: 1 Latest checkpoint's oldestMulti's DB: 0 Latest checkpoint's oldestCommitTsXid:0 Latest checkpoint's newestCommitTsXid:0 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum columns in an index: 32 Maximum size of a TOAST chunk: 1996 Size of a large-object chunk: 2048 Date/time type storage: 64-bit integers Float4 argument passing: by value Float8 argument passing: by value Data page checksum version: 0 Values to be changed: First log segment after reset: 000000010000000000000002 If these values seem acceptable, use -f to force reset. [centos@centos-cpula ~]$ -- regards,tushar EnterpriseDB https://www.enterprisedb.com/ The Enterprise PostgreSQL Company
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Amit Kapila
Date:
On Mon, May 29, 2017 at 3:20 PM, tushar <tushar.ahuja@enterprisedb.com> wrote: > On 05/29/2017 03:10 PM, Amit Kapila wrote: >> >> What makes you think above is a valid usage and should >> pass? > > with earlier versions ,for instance - v.96 v/s v9.5 ,pg_resetwal was giving > pg_control values . > > Installed v9.6 and v9.5 and run pg_resetwal of v9.6 against data directory > of v9.5. > > [centos@centos-cpula ~]$ /tmp/pg9.6/bin/pg_resetxlog -D /tmp/pg9.5/bin/data/ > pg_resetxlog: pg_control exists but is broken or unknown version; ignoring > it > Guessed pg_control values: > I think this happens due to commit f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e which renames pg_xlog to pg_wal. It does take care of making some of the modules like pg_basebackup to understand both old and new directory structures, but not done for all the modules. I think we should make similar changes in pg_resetwal or at the very least update the docs to indicate pg_resetwal can give an error if used against pre-10 data directory. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_resetwal is broken if run from v10 against older version of PG data directory
From
Tom Lane
Date:
tushar <tushar.ahuja@enterprisedb.com> writes: > I have installed PG v9.6 / v9.5 , if we run pg_resetwal from v10 > binaries against data directory of v9.6/9.5 ,getting this error - > centos@centos-cpula bin]$ ./pg_resetwal -D /tmp/pg9.6/bin/data/ > pg_resetwal: pg_control exists but is broken or unknown version; ignoring it > pg_resetwal: could not open directory "pg_wal": No such file or directory This is pretty much exactly what I'd expect. Why would you think that pg_resetwal should work with other major versions? (Maybe we should have it look at PG_VERSION and refuse to do anything if that doesn't have the expected contents.) regards, tom lane
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Michael Paquier
Date:
On Mon, May 29, 2017 at 3:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > I think this happens due to commit > f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e which renames pg_xlog to > pg_wal. It does take care of making some of the modules like > pg_basebackup to understand both old and new directory structures, but > not done for all the modules. I think we should make similar changes > in pg_resetwal or at the very least update the docs to indicate > pg_resetwal can give an error if used against pre-10 data directory. Contrary to pg_basebackup which makes clear in its documentation that it is compatible with past server versions down to 9.1, pg_resetwal does not mention such guarantees. And actually, it seems to me that it is rather unsafe to use it across major versions as the size of ControlFileData varies across major versions so you can write junk bytes in the control file by using pg_resetwal from v10 on a 9.6 server. -- Michael
Re: [HACKERS] pg_resetwal is broken if run from v10 against older version of PG data directory
From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes: > I think this happens due to commit > f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e which renames pg_xlog to > pg_wal. It does take care of making some of the modules like > pg_basebackup to understand both old and new directory structures, but > not done for all the modules. check > I think we should make similar changes > in pg_resetwal or at the very least update the docs to indicate > pg_resetwal can give an error if used against pre-10 data directory. I think it's just horribly dangerous to run any version of pg_resetxlog/pg_resetwal against any major version other than its own. If people have been doing what the OP tried to do, it's only been sheerest luck if it didn't destroy their database beyond recall. The result will certainly fail to start up, because both pg_control and WAL page header versions will not match the server version. The *best* case scenario is that redoing the reset with the correct version of pg_resetxlog/pg_resetwal gets you out of that with no new damage done. The worst case is pretty awful --- for instance, a truly bullheaded user might try to get out of it by starting the newer server version in the old DB, likely causing irrecoverable catalog corruption. So we need to prevent this, not try to make it work. I don't think we can insist on a version match in pg_control, because part of the point of pg_resetxlog/pg_resetwal is to recover if pg_control is unreadable. But I think we could look at PG_VERSION, which is only a text file. In bad corruption scenarios, if that somehow got corrupted (which seems unlikely since it's never written to post-initdb), you could fill in the correct contents by hand and then pg_resetxlog/pg_resetwal would run. regards, tom lane
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Michael Paquier
Date:
On Mon, May 29, 2017 at 9:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > So we need to prevent this, not try to make it work. I don't think > we can insist on a version match in pg_control, because part of the > point of pg_resetxlog/pg_resetwal is to recover if pg_control is > unreadable. But I think we could look at PG_VERSION, which is only a > text file. In bad corruption scenarios, if that somehow got corrupted > (which seems unlikely since it's never written to post-initdb), > you could fill in the correct contents by hand and then > pg_resetxlog/pg_resetwal would run. Agreed. Shouldn't this be back-patched? PG_CONTROL_VERSION has not been bumped between 9.4 and 9.5. Attached is a patch for HEAD. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pg_resetwal is broken if run from v10 against older version of PG data directory
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > On Mon, May 29, 2017 at 9:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So we need to prevent this, not try to make it work. I don't think >> we can insist on a version match in pg_control, because part of the >> point of pg_resetxlog/pg_resetwal is to recover if pg_control is >> unreadable. But I think we could look at PG_VERSION, which is only a >> text file. > Agreed. Shouldn't this be back-patched? PG_CONTROL_VERSION has not > been bumped between 9.4 and 9.5. Attached is a patch for HEAD. Yeah, I'm thinking it would be a good idea to enforce this in all branches. Your patch looks sane in a quick once-over, but I didn't test it. regards, tom lane
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Michael Paquier
Date:
On Mon, May 29, 2017 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> On Mon, May 29, 2017 at 9:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> So we need to prevent this, not try to make it work. I don't think >>> we can insist on a version match in pg_control, because part of the >>> point of pg_resetxlog/pg_resetwal is to recover if pg_control is >>> unreadable. But I think we could look at PG_VERSION, which is only a >>> text file. > >> Agreed. Shouldn't this be back-patched? PG_CONTROL_VERSION has not >> been bumped between 9.4 and 9.5. Attached is a patch for HEAD. > > Yeah, I'm thinking it would be a good idea to enforce this in all > branches. Your patch looks sane in a quick once-over, but I didn't > test it. Thanks. I can provide patches for back-branches as needed. -- Michael
Re: [HACKERS] pg_resetwal is broken if run from v10 against older version of PG data directory
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > Agreed. Shouldn't this be back-patched? PG_CONTROL_VERSION has not > been bumped between 9.4 and 9.5. Attached is a patch for HEAD. Pushed with minor adjustments. Notably, I didn't take your addition of canonicalize_path() and referencing the file by absolute path. That seems like an independent and rather debatable behavioral change. All the other files that pg_resetwal touches are referenced with relative paths; if we did want to change the reporting, shouldn't we change it for all of them? regards, tom lane
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Amit Kapila
Date:
On Mon, May 29, 2017 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: >> I think this happens due to commit >> f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e which renames pg_xlog to >> pg_wal. It does take care of making some of the modules like >> pg_basebackup to understand both old and new directory structures, but >> not done for all the modules. > > check > >> I think we should make similar changes >> in pg_resetwal or at the very least update the docs to indicate >> pg_resetwal can give an error if used against pre-10 data directory. > > I think it's just horribly dangerous to run any version of > pg_resetxlog/pg_resetwal against any major version other than its > own. If people have been doing what the OP tried to do, it's only > been sheerest luck if it didn't destroy their database beyond recall. > The result will certainly fail to start up, because both pg_control > and WAL page header versions will not match the server version. The > *best* case scenario is that redoing the reset with the correct version > of pg_resetxlog/pg_resetwal gets you out of that with no new damage > done. The worst case is pretty awful --- for instance, a truly > bullheaded user might try to get out of it by starting the newer server > version in the old DB, likely causing irrecoverable catalog corruption. > > So we need to prevent this, not try to make it work. > Yeah, I was wrong in suggesting to try to make it work. An explicit error is a better way to deal with this. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Craig Ringer
Date:
On 30 May 2017 at 00:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think it's just horribly dangerous to run any version of > pg_resetxlog/pg_resetwal You can pretty much stop there ;) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Simon Riggs
Date:
On 29 May 2017 at 17:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: >> I think this happens due to commit >> f82ec32ac30ae7e3ec7c84067192535b2ff8ec0e which renames pg_xlog to >> pg_wal. It does take care of making some of the modules like >> pg_basebackup to understand both old and new directory structures, but >> not done for all the modules. > > check > >> I think we should make similar changes >> in pg_resetwal or at the very least update the docs to indicate >> pg_resetwal can give an error if used against pre-10 data directory. > > I think it's just horribly dangerous to run any version of > pg_resetxlog/pg_resetwal against any major version other than its > own. Just check the name of the directory so that pg_resetwal will refuse to run against pg_xlog directory -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pg_resetwal is broken if run from v10 against olderversion of PG data directory
From
Robert Haas
Date:
On Tue, May 30, 2017 at 4:24 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Just check the name of the directory so that pg_resetwal will refuse > to run against pg_xlog directory That's a strictly weaker integrity check than what Tom already committed. That only distinguishes pre-10 from post-10, whereas Tom linked it to the specific major version, which is better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company