Thread: Always bump PG_CONTROL_VERSION?

Always bump PG_CONTROL_VERSION?

From
David Steele
Date:
Hackers,

I would like to propose bumping PG_CONTROL_VERSION with each release 
even if there are no changes to the ControlFileData struct. Historically 
PG_CONTROL_VERSION has only been bumped when there were changes to 
ControlFileData.

pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL 
when it is not running. If PG_CONTROL_VERSION does not change from a 
prior version then we also use CATALOG_VERSION_NO to uniquely identify 
the version.

This works fine, but is pretty fragile during the alpha/beta releases 
when CATALOG_VERSION_NO is likely to change with each release. Of 
course, PG_CONTROL_VERSION might change as well but this seems to be 
extremely rare for an alpha/beta release.

There are a few commits like eeca4cd3 and 99dd8b05a that would seem to 
argue that bumping PG_CONTROL_VERSION at least once for each release is 
a good idea in general. It doesn't seem too useful to be able to run 
pg_resetwal or pg_controldata against another version, in the few cases 
that it would actually work, e.g. 9.6/9.5.

Thoughts?
-- 
-David
david@pgmasters.net



Re: Always bump PG_CONTROL_VERSION?

From
Alvaro Herrera
Date:
On 2021-May-12, David Steele wrote:

> pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
> when it is not running. If PG_CONTROL_VERSION does not change from a prior
> version then we also use CATALOG_VERSION_NO to uniquely identify the
> version.

Why don't you use the PG_VERSION file in the datadir?


-- 
Álvaro Herrera                            39°49'30"S 73°17'W
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)



Re: Always bump PG_CONTROL_VERSION?

From
Andres Freund
Date:
Hi,

On 2021-05-12 14:58:11 -0400, David Steele wrote:
> pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
> when it is not running. If PG_CONTROL_VERSION does not change from a prior
> version then we also use CATALOG_VERSION_NO to uniquely identify the
> version.

Why aren't you using PG_VERSION?

Greetings,

Andres Freund



Re: Always bump PG_CONTROL_VERSION?

From
David Steele
Date:
On 5/12/21 3:21 PM, Alvaro Herrera wrote:
> On 2021-May-12, David Steele wrote:
> 
>> pgBackRest uses PG_CONTROL_VERSION to identify the version of PostgreSQL
>> when it is not running. If PG_CONTROL_VERSION does not change from a prior
>> version then we also use CATALOG_VERSION_NO to uniquely identify the
>> version.
> 
> Why don't you use the PG_VERSION file in the datadir?

Mostly because there is other data we need in pg_control and it is 
simpler to read one file than two.

Regards,
-- 
-David
david@pgmasters.net



Re: Always bump PG_CONTROL_VERSION?

From
Tom Lane
Date:
David Steele <david@pgmasters.net> writes:
> On 5/12/21 3:21 PM, Alvaro Herrera wrote:
>> Why don't you use the PG_VERSION file in the datadir?

> Mostly because there is other data we need in pg_control and it is 
> simpler to read one file than two.

I'm disinclined to change the longstanding rule in this area for
a reason as weak as that.

Even if we did change the rule going forward, you'd still need to
do it properly for existing releases, so I don't see that you're
going to save anything.

            regards, tom lane



Re: Always bump PG_CONTROL_VERSION?

From
Andres Freund
Date:
Hi,

On 2021-05-12 16:18:16 -0400, Tom Lane wrote:
> Even if we did change the rule going forward, you'd still need to
> do it properly for existing releases, so I don't see that you're
> going to save anything.

It turns out that the last time a major version didn't have a unique
control file version was 9.5, I assume that's where David is coming
from.

That said, I don't think it's a good practice to use the control file
version as an identifier for the major version. Who knows, it might be
necessary to add an optional new format in a minor version at some point
or such crazyness. And then there's the beta stuff you'd mentioned, etc.

Greetings,

Andres Freund



Re: Always bump PG_CONTROL_VERSION?

From
David Steele
Date:
On 5/12/21 4:18 PM, Tom Lane wrote:
> David Steele <david@pgmasters.net> writes:
>> On 5/12/21 3:21 PM, Alvaro Herrera wrote:
>>> Why don't you use the PG_VERSION file in the datadir?
> 
>> Mostly because there is other data we need in pg_control and it is
>> simpler to read one file than two.
> 
> I'm disinclined to change the longstanding rule in this area for
> a reason as weak as that.

It's a bit more than that -- for instance we have pg_control in every 
backup but in order to get PG_VERSION we may need to reference a prior 
backup since it never changes. pg_control is also checked on every 
archive_command/restore_command so reading an extra file adds up when 
performance is paramount. There are also many unit tests that need to 
write this data, etc.

In short, it would be very nice to have one place to get info about a 
cluster.

> Even if we did change the rule going forward, you'd still need to
> do it properly for existing releases, so I don't see that you're
> going to save anything.

It's not really a burden for existing releases. The issue is during the 
alpha/beta phase when the CATALOG_VERSION_NO can change several times in 
a few months.

Perhaps it was unwise to frame this in the requirements for an external 
tool, but I still think eeca4cd3 and 99dd8b05a argue for it being a good 
idea.

Or perhaps we could just add the version number to pg_control? At least 
then pg_controldata could display the version.

Regards,
-- 
-David
david@pgmasters.net



Re: Always bump PG_CONTROL_VERSION?

From
Michael Paquier
Date:
On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
> That said, I don't think it's a good practice to use the control file
> version as an identifier for the major version. Who knows, it might be
> necessary to add an optional new format in a minor version at some point
> or such crazyness. And then there's the beta stuff you'd mentioned, etc.

Yes, PG_VERSION, as you wrote upthread already, is already fine for
the job, and FWIW, I have yet to see a case where being able to easily
detect the minor version in a data folder matters.

And, I am of the opinion to not change the control file version if
there is no need to do so.
--
Michael

Attachment

Re: Always bump PG_CONTROL_VERSION?

From
Julien Rouhaud
Date:
On Thu, May 13, 2021 at 11:04:54AM +0900, Michael Paquier wrote:
> On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
> > That said, I don't think it's a good practice to use the control file
> > version as an identifier for the major version. Who knows, it might be
> > necessary to add an optional new format in a minor version at some point
> > or such crazyness. And then there's the beta stuff you'd mentioned, etc.
> 
> Yes, PG_VERSION, as you wrote upthread already, is already fine for
> the job, and FWIW, I have yet to see a case where being able to easily
> detect the minor version in a data folder matters.

Would it even make any sense?  It could be "has version X.Y ever started the
cluster", or "was it the last version that started the cluster", or something
else?

> And, I am of the opinion to not change the control file version if
> there is no need to do so.

+1



Re: Always bump PG_CONTROL_VERSION?

From
Jan Wieck
Date:
On 5/12/21 10:04 PM, Michael Paquier wrote:
> On Wed, May 12, 2021 at 01:30:27PM -0700, Andres Freund wrote:
>> That said, I don't think it's a good practice to use the control file
>> version as an identifier for the major version. Who knows, it might be
>> necessary to add an optional new format in a minor version at some point
>> or such crazyness. And then there's the beta stuff you'd mentioned, etc.
> 
> Yes, PG_VERSION, as you wrote upthread already, is already fine for
> the job, and FWIW, I have yet to see a case where being able to easily
> detect the minor version in a data folder matters.

Regardless of PG_VERSION doing the job or not, shouldn't there be a bump 
in PG_CONTROL_VERSION whenever there is a structural or semantic change 
in the control file data? And wouldn't the easiest way to ensure that be 
to bump the version with every release?

Also, can someone give me a good reason NOT to bump the version?


Thanks, Jan

-- 
Jan Wieck
Postgres User since 1994



Re: Always bump PG_CONTROL_VERSION?

From
Tom Lane
Date:
Jan Wieck <jan@wi3ck.info> writes:
> Regardless of PG_VERSION doing the job or not, shouldn't there be a bump 
> in PG_CONTROL_VERSION whenever there is a structural or semantic change 
> in the control file data? And wouldn't the easiest way to ensure that be 
> to bump the version with every release?

No, the way to do that is to change the version number in the commit
that changes the file's contents.

> Also, can someone give me a good reason NOT to bump the version?

It creates unnecessary churn, not to mention a false sense of
complacency.  Bumping the version in the commit that changes
things is not optional, because if you don't do that then you'll
probably burn some other developer also working on HEAD.  So
I don't want people thinking they can skip this because it was
done at the beginning of the development cycle.  We've learned
these things the hard way for CATVERSION.  I think the only reason
that PG_CONTROL version or WAL version might seem different is
that we haven't changed them often enough for people to have fresh
memories of problems.

            regards, tom lane



Re: Always bump PG_CONTROL_VERSION?

From
Jan Wieck
Date:
On 5/13/21 6:45 PM, Tom Lane wrote:
>               Bumping the version in the commit that changes
> things is not optional, because if you don't do that then you'll
> probably burn some other developer also working on HEAD.  So
> I don't want people thinking they can skip this because it was
> done at the beginning of the development cycle.

And we make sure this is done how?


Regards, Jan

-- 
Jan Wieck
Postgres User since 1994



Re: Always bump PG_CONTROL_VERSION?

From
Tom Lane
Date:
Jan Wieck <jan@wi3ck.info> writes:
> On 5/13/21 6:45 PM, Tom Lane wrote:
>> Bumping the version in the commit that changes
>> things is not optional, because if you don't do that then you'll
>> probably burn some other developer also working on HEAD.  So
>> I don't want people thinking they can skip this because it was
>> done at the beginning of the development cycle.

> And we make sure this is done how?

Peer pressure?  It's not that different from N other ways to
do a patch incorrectly, of course.

            regards, tom lane



Re: Always bump PG_CONTROL_VERSION?

From
Andres Freund
Date:
Hi,

On 2021-05-13 17:42:52 -0400, Jan Wieck wrote:
> Also, can someone give me a good reason NOT to bump the version?

There's several types of tools (particularly around backup) that need to
parse control files. Unnecessarily increasing the numbers of versions
that need to be dealt with makes that a bit harder.

Greetings,

Andres Freund



Re: Always bump PG_CONTROL_VERSION?

From
Michael Paquier
Date:
On Mon, May 17, 2021 at 03:47:01PM -0700, Andres Freund wrote:
> There's several types of tools (particularly around backup) that need to
> parse control files. Unnecessarily increasing the numbers of versions
> that need to be dealt with makes that a bit harder.

I am digressing here, sorry for that..

But it is worth noting that it is fun to debug issues where a user
changes the system locale and breaks some logic that parses the output
of pg_controldata because of the translations of the text fields.
I've really wondered over the years whether there should be more
switches to pick up only the field values, for the popular ones like
TLIs for example.  pg_config does that.
--
Michael

Attachment