Thread: Did we really want to force an initdb in beta2?
Because that's the consequences of fooling with pg_control. I committed the PG_CONTROL_VERSION bump that was missing from the patch Robert committed last night, but I wonder whether we shouldn't revert the whole thing instead. It's not apparent to me that what it bought is worth forcing beta testers to initdb. regards, tom lane
On 03/06/10 17:54, Tom Lane wrote: > Because that's the consequences of fooling with pg_control. > I committed the PG_CONTROL_VERSION bump that was missing from > the patch Robert committed last night, but I wonder whether > we shouldn't revert the whole thing instead. It's not apparent > to me that what it bought is worth forcing beta testers to initdb. Hmph, good point, I did not think of that at all when I reviewed the patch. If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, we would stay backwards-compatible. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, Jun 3, 2010 at 11:25 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 03/06/10 17:54, Tom Lane wrote: >> >> Because that's the consequences of fooling with pg_control. >> I committed the PG_CONTROL_VERSION bump that was missing from >> the patch Robert committed last night, but I wonder whether >> we shouldn't revert the whole thing instead. It's not apparent >> to me that what it bought is worth forcing beta testers to initdb. > > Hmph, good point, I did not think of that at all when I reviewed the patch. > > If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the enum, > we would stay backwards-compatible. Ugh, sorry about that. I didn't realize this either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the > enum, we would stay backwards-compatible. I don't think that's a terribly workable idea; the enum is laid out so that inequality tests are sensible, and I'm not sure there aren't any. The code would look mighty ugly in any case. What exactly was the reason for this patch? Could it be held over till 9.1? regards, tom lane
On 03/06/10 19:16, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> If we moved the new DB_SHUTDOWNED_IN_RECOVERY as the last item in the >> enum, we would stay backwards-compatible. > > I don't think that's a terribly workable idea; the enum is laid out so > that inequality tests are sensible, and I'm not sure there aren't any. Hmm, the only inequality tests on that field I can see check that the value is valid, i.e between the first and last valid value. > The code would look mighty ugly in any case. True. One more hacky idea: Keep the code as it is and change pg_control version back to what it was in beta1. Add a note in the release notes that if you're upgrading from beta1, you must shut down the database cleanly first. When you do that, control file is in DB_SHUTDOWNED state, and the enum value for that did not change. One caveat is that a standby server will be DB_IN_ARCHIVE_RECOVERY, which did change value so that with beta2 binaries it will look like DB_IN_CRASH_RECOVERY. I think that would still work, though (and if not, in the worst case you'll just have to reinitialize the standby from a new base backup). > What exactly was the reason for this patch? Could it be held over till > 9.1? Before the patch, when you shut down a standby server, you get this message in the log on the next startup: LOG: database system was interrupted while in recovery at log time 2010-06-02 14:48:28 EEST HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. The problem is that that hint is pretty alarming. The data should be fine if the standby server was shut down cleanly with pg_ctl stop -m fast/smart. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 03/06/10 19:16, Tom Lane wrote: >> What exactly was the reason for this patch? Could it be held over till >> 9.1? > Before the patch, when you shut down a standby server, you get this > message in the log on the next startup: > LOG: database system was interrupted while in recovery at log time > 2010-06-02 14:48:28 EEST > HINT: If this has occurred more than once some data might be corrupted > and you might need to choose an earlier recovery target. > The problem is that that hint is pretty alarming. Maybe we should just get rid of the hint. regards, tom lane
On Jun 3, 2010, at 19:00 , Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> On 03/06/10 19:16, Tom Lane wrote: >>> What exactly was the reason for this patch? Could it be held over till >>> 9.1? > >> Before the patch, when you shut down a standby server, you get this >> message in the log on the next startup: > >> LOG: database system was interrupted while in recovery at log time >> 2010-06-02 14:48:28 EEST >> HINT: If this has occurred more than once some data might be corrupted >> and you might need to choose an earlier recovery target. > >> The problem is that that hint is pretty alarming. > > Maybe we should just get rid of the hint. FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doing thatis that a real crash during recovery *is* something to be quite alarmed about. best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > On Jun 3, 2010, at 19:00 , Tom Lane wrote: >> Maybe we should just get rid of the hint. > FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doing thatis that a real crash during recovery *is* something to be quite alarmed about. After some discussion among core we're going to leave it as-is. Anybody who doesn't want to initdb for beta2 can test out pg_upgrade ;-) regards, tom lane
On Thu, Jun 3, 2010 at 11:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Florian Pflug <fgp@phlo.org> writes: >> On Jun 3, 2010, at 19:00 , Tom Lane wrote: >>> Maybe we should just get rid of the hint. > >> FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doing thatis that a real crash during recovery *is* something to be quite alarmed about. > > After some discussion among core we're going to leave it as-is. Anybody > who doesn't want to initdb for beta2 can test out pg_upgrade ;-) Shouldn't we have bumped the catversion? The installers can't tell that beta1 clusters won't work with beta2 :-( -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
Dave Page wrote: > On Thu, Jun 3, 2010 at 11:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Florian Pflug <fgp@phlo.org> writes: > >> On Jun 3, 2010, at 19:00 , Tom Lane wrote: > >>> Maybe we should just get rid of the hint. > > > >> FYI, Robert Haas suggested the same in the thread that lead to this patch being applied. The arguments against doingthat is that a real crash during recovery *is* something to be quite alarmed about. > > > > After some discussion among core we're going to leave it as-is. ?Anybody > > who doesn't want to initdb for beta2 can test out pg_upgrade ;-) > > Shouldn't we have bumped the catversion? The installers can't tell > that beta1 clusters won't work with beta2 :-( That is an interesting point. Tom bumped the pg_control version, but not the catalog version. I am unclear how that affects people's visibility about incompatibility. (pg_upgrade will not care.) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +
Bruce Momjian <bruce@momjian.us> writes: > Dave Page wrote: >> Shouldn't we have bumped the catversion? The installers can't tell >> that beta1 clusters won't work with beta2 :-( > That is an interesting point. Tom bumped the pg_control version, but > not the catalog version. Right, because the catalog contents didn't change. Seems to me you'd better teach the installers to look at PG_CONTROL_VERSION too. regards, tom lane
On Fri, Jun 4, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Dave Page wrote: >>> Shouldn't we have bumped the catversion? The installers can't tell >>> that beta1 clusters won't work with beta2 :-( > >> That is an interesting point. Tom bumped the pg_control version, but >> not the catalog version. > > Right, because the catalog contents didn't change. Seems to me you'd > better teach the installers to look at PG_CONTROL_VERSION too. Hmm, is there anything else that might need to be checked? -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
On Fri, Jun 4, 2010 at 11:06 AM, Dave Page <dpage@pgadmin.org> wrote: > On Fri, Jun 4, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> Dave Page wrote: >>>> Shouldn't we have bumped the catversion? The installers can't tell >>>> that beta1 clusters won't work with beta2 :-( >> >>> That is an interesting point. Tom bumped the pg_control version, but >>> not the catalog version. >> >> Right, because the catalog contents didn't change. Seems to me you'd >> better teach the installers to look at PG_CONTROL_VERSION too. > > Hmm, is there anything else that might need to be checked? XLOG_PAGE_MAGIC, for one. PG_PAGE_LAYOUT_VERSION doesn't change very often, but might also fall into the same category. Tablespace directory paths depend on the value of PG_MAJORVERSION. It would be nice to have all of these documented somewhere along with the criteria for bumping each one. It's relatively easy for a new committer (ahem) to not realize that there's a version number that needs to be bumped someplace, and recent experience has shown that even an experienced committer can goof. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Dave Page <dpage@pgadmin.org> writes: > On Fri, Jun 4, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Right, because the catalog contents didn't change. �Seems to me you'd >> better teach the installers to look at PG_CONTROL_VERSION too. > Hmm, is there anything else that might need to be checked? Offhand I can think of three internal version-like numbers: CATALOG_VERSION_NO --- bump if initial system catalog contents would be inconsistent with backend code PG_CONTROL_VERSION --- bump when contents of pg_control change XLOG_PAGE_MAGIC --- bump on incompatible change in WAL contents regards, tom lane
Tom Lane wrote: > Dave Page <dpage@pgadmin.org> writes: > > On Fri, Jun 4, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Right, because the catalog contents didn't change. �Seems to me you'd > >> better teach the installers to look at PG_CONTROL_VERSION too. > > > Hmm, is there anything else that might need to be checked? > > Offhand I can think of three internal version-like numbers: > > CATALOG_VERSION_NO --- bump if initial system catalog contents would be > inconsistent with backend code > > PG_CONTROL_VERSION --- bump when contents of pg_control change > > XLOG_PAGE_MAGIC --- bump on incompatible change in WAL contents pg_upgrade never views these in their raw format so does not need to check them. (It does look at pg_controldata text output.) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +
Robert Haas <robertmhaas@gmail.com> writes: > It would be nice to have all of these documented somewhere along with > the criteria for bumping each one. Go for it. I think you have all the raw data in this thread. regards, tom lane
On Fri, Jun 4, 2010 at 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Page <dpage@pgadmin.org> writes: >> On Fri, Jun 4, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Right, because the catalog contents didn't change. Seems to me you'd >>> better teach the installers to look at PG_CONTROL_VERSION too. > >> Hmm, is there anything else that might need to be checked? > > Offhand I can think of three internal version-like numbers: > > CATALOG_VERSION_NO --- bump if initial system catalog contents would be > inconsistent with backend code > > PG_CONTROL_VERSION --- bump when contents of pg_control change They're easy enough. > XLOG_PAGE_MAGIC --- bump on incompatible change in WAL contents How can I get that from an existing data directory? I don't see it in pg_controldata output (unless it has a non-obvious alias). -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
Dave Page <dpage@pgadmin.org> writes: > On Fri, Jun 4, 2010 at 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> XLOG_PAGE_MAGIC --- bump on incompatible change in WAL contents > How can I get that from an existing data directory? I don't see it in > pg_controldata output (unless it has a non-obvious alias). You'd need to pull it out of one of the WAL files. I'm not sure it's worth the trouble ... regards, tom lane
On Fri, Jun 4, 2010 at 7:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Page <dpage@pgadmin.org> writes: >> On Fri, Jun 4, 2010 at 4:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> XLOG_PAGE_MAGIC --- bump on incompatible change in WAL contents > >> How can I get that from an existing data directory? I don't see it in >> pg_controldata output (unless it has a non-obvious alias). > > You'd need to pull it out of one of the WAL files. I'm not sure it's > worth the trouble ... Urgh, no. Probably not. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company