Thread: Tracking cluster upgrade and configuration history

Tracking cluster upgrade and configuration history

From
Mark Dilger
Date:
Hello Hackers,

While supporting customers, it would frequently be useful to have more information about the history of a cluster.  For
example,which prior versions were ever installed and running on the cluster?  Has the cluster ever been run with
fsync=off? When did the server last enter recovery, if ever?  Was a backup_label file present at that time? 

Some of this type of information could strictly be fixed size, such as a fixed set of timestamps for the time at which
afixed set of things last occurred, or a fixed set of bits indicating whether a fixed set of things ever happened. 

Some other types would be variable size, but hopefully short in practice, like a list of all postgres versions that
haveever been run on the cluster. 

Logging the information via the usual log mechanism seems insufficient, as log files may get rotated and this
informationlost. 

Would it be acceptable to store some fixed set of flag bits and timestamps in pg_control?  Space there is at a premium.

Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data, say
anew file named "cluster_history" or such? 

I'm happy to put together a more concrete proposal, but solicit your opinions first on the merits of the idea
generally,and if you think the idea good, on the specifics you'd like to see included. 

Thanks!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Tracking cluster upgrade and configuration history

From
Bharath Rupireddy
Date:
On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> While supporting customers, it would frequently be useful to have more information about the history of a cluster.
Forexample, which prior versions were ever installed and running on the cluster?  Has the cluster ever been run with
fsync=off? When did the server last enter recovery, if ever?  Was a backup_label file present at that time? 
>

+1 for the idea. The information will be useful at times for debugging purposes.

>
> Some of this type of information could strictly be fixed size, such as a fixed set of timestamps for the time at
whicha fixed set of things last occurred, or a fixed set of bits indicating whether a fixed set of things ever
happened.
>
> Some other types would be variable size, but hopefully short in practice, like a list of all postgres versions that
haveever been run on the cluster. 
>
> Logging the information via the usual log mechanism seems insufficient, as log files may get rotated and this
informationlost. 
>

True. Just a thought, can we use existing logging mechanism and APIs
to write to a new file that never gets rotated by the syslogger(Of
course, we need to think of the maximum file size that's allowed)? The
idea is like this: we use elog/ereport and so on with a new debug
level, when specified, instead of logging into the standard log files,
we log it to the new file.

>
> Would it be acceptable to store some fixed set of flag bits and timestamps in pg_control?  Space there is at a
premium.
>

Since we allocate ControlFileData in shared memory and also we may
have some data with timestamps, variable texts and so on, having this
included in pg_control data structure would not seem a good idea to
me.

>
> Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data,
saya new file named "cluster_history" or such? 
>

IMHO, this is also a good idea. We need to think of the APIs to
open/read/write/close that history file? How often and which processes
and what type of data they write? Is it that the postmaster alone will
write into that file? If multiple processes are allowed to write, how
to deal with concurrent writers? Will users have to open manually and
read that file? or Will we have some program similar to
pg_controldata? Will we have some maximum limit to the size of this
file?

>
> I'm happy to put together a more concrete proposal, but solicit your opinions first on the merits of the idea
generally,and if you think the idea good, on the specifics you'd like to see included. 
>

Welcome to know more about this idea.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Tracking cluster upgrade and configuration history

From
Ian Lawrence Barwick
Date:
2020年11月16日(月) 15:48 Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>:
>
> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> >
> > While supporting customers, it would frequently be useful to have more information about the history of a cluster.
Forexample, which prior versions were ever installed and running on the cluster?  Has the cluster ever been run with
fsync=off? When did the server last enter recovery, if ever?  Was a backup_label file present at that time? 
> >
>
> +1 for the idea. The information will be useful at times for debugging purposes.

It's certainly something which would be nice to have.

> > Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data,
saya new file named "cluster_history" or such? 
> >
>
> IMHO, this is also a good idea. We need to think of the APIs to
> open/read/write/close that history file? How often and which processes
> and what type of data they write? Is it that the postmaster alone will
> write into that file? If multiple processes are allowed to write, how
> to deal with concurrent writers? Will users have to open manually and
> read that file? or Will we have some program similar to
> pg_controldata? Will we have some maximum limit to the size of this
> file?

pg_stat_statements might be worth looking at as one way of handling that kind
of file.

However the problem with keeping a separate file which is not WAL-logged would
mean it doesn't get propagated to standbys, and there's also the question
of how it could be maintained across upgrades via pg_upgrade.

FWIW I did once create a background worker extension [1] which logs
configuration changes to a table, though it's not particularly maintained or
recommended for production use.

[1] https://github.com/ibarwick/config_log


Regards

Ian Barwick
--
EnterpriseDB: https://www.enterprisedb.com



Re: Tracking cluster upgrade and configuration history

From
Mark Dilger
Date:

> On Nov 15, 2020, at 10:47 PM, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>>
>> While supporting customers, it would frequently be useful to have more information about the history of a cluster.
Forexample, which prior versions were ever installed and running on the cluster?  Has the cluster ever been run with
fsync=off? When did the server last enter recovery, if ever?  Was a backup_label file present at that time? 
>>
>
> +1 for the idea. The information will be useful at times for debugging purposes.

Thanks for the feedback.

>
>>
>> Some of this type of information could strictly be fixed size, such as a fixed set of timestamps for the time at
whicha fixed set of things last occurred, or a fixed set of bits indicating whether a fixed set of things ever
happened.
>>
>> Some other types would be variable size, but hopefully short in practice, like a list of all postgres versions that
haveever been run on the cluster. 
>>
>> Logging the information via the usual log mechanism seems insufficient, as log files may get rotated and this
informationlost. 
>>
>
> True. Just a thought, can we use existing logging mechanism and APIs
> to write to a new file that never gets rotated by the syslogger(Of
> course, we need to think of the maximum file size that's allowed)? The
> idea is like this: we use elog/ereport and so on with a new debug
> level, when specified, instead of logging into the standard log files,
> we log it to the new file.

That's going in a very different direction from what I had in mind.  I was imagining something like a single binary or
textfile of either fixed size or something very short but not fixed.  The "very short but not fixed" part of that seems
abit too hand-waving on reflection.  Any variable length list, such as the list of postgres versions started on the
cluster,could be made fixed length by only tracking the most recent N of them, perhaps with a flag bit to indicate if
thelist has overflowed. 

Using elog/ereport with a new log level that gets directed into a different log file is an interesting idea, but it is
notclear how to use elog/ereport in a principled way to write files that need never get too large. 

>> Would it be acceptable to store some fixed set of flag bits and timestamps in pg_control?  Space there is at a
premium.
>>
>
> Since we allocate ControlFileData in shared memory and also we may
> have some data with timestamps, variable texts and so on, having this
> included in pg_control data structure would not seem a good idea to
> me.

Variable length texts seem completely out of scope for this.  I would expect the data to be a collection of integer
typesand flag bits.  Fixed length text might also be possible, but I don't have any examples in mind of text that we'd
wantto track. 

>> Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data,
saya new file named "cluster_history" or such? 
>>
>
> IMHO, this is also a good idea. We need to think of the APIs to
> open/read/write/close that history file? How often and which processes
> and what type of data they write? Is it that the postmaster alone will
> write into that file? If multiple processes are allowed to write, how
> to deal with concurrent writers? Will users have to open manually and
> read that file? or Will we have some program similar to
> pg_controldata? Will we have some maximum limit to the size of this
> file?

This depends in part on feedback about which information others on this list would like to see included, but I was
imaginingsomething similar to how pg_control works, or using pg_control itself.  The maximum size for pg_control is 512
bytes,and on my system sizeof(ControlFileData) = 296, which leaves 216 bytes free.  I didn't check how much that might
changeon systems with different alignments.  We could either use some of the ~200 bytes currently available in
pg_control,or use another file, "pg_history" or such, following the design pattern already used for pg_control. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Tracking cluster upgrade and configuration history

From
Mark Dilger
Date:

> On Nov 15, 2020, at 11:23 PM, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> 2020年11月16日(月) 15:48 Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>:
>>
>> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
>> <mark.dilger@enterprisedb.com> wrote:
>>>
>>> While supporting customers, it would frequently be useful to have more information about the history of a cluster.
Forexample, which prior versions were ever installed and running on the cluster?  Has the cluster ever been run with
fsync=off? When did the server last enter recovery, if ever?  Was a backup_label file present at that time? 
>>>
>>
>> +1 for the idea. The information will be useful at times for debugging purposes.
>
> It's certainly something which would be nice to have.

Thanks for the feedback.

>>> Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data,
saya new file named "cluster_history" or such? 
>>>
>>
>> IMHO, this is also a good idea. We need to think of the APIs to
>> open/read/write/close that history file? How often and which processes
>> and what type of data they write? Is it that the postmaster alone will
>> write into that file? If multiple processes are allowed to write, how
>> to deal with concurrent writers? Will users have to open manually and
>> read that file? or Will we have some program similar to
>> pg_controldata? Will we have some maximum limit to the size of this
>> file?
>
> pg_stat_statements might be worth looking at as one way of handling that kind
> of file.
>
> However the problem with keeping a separate file which is not WAL-logged would
> mean it doesn't get propagated to standbys, and there's also the question
> of how it could be maintained across upgrades via pg_upgrade.

Hmmm.  I was not expecting the file to be propagated to standbys.  The information could legitimately be different for
aprimary and a standby.  As a very simple example, there may be a flag bit for whether the cluster has operated as a
standby. That does raise questions about what sort of information about a primary that a standby should track, in case
theyget promoted to primary and information about the old primary would be useful for troubleshooting.  Ideas
welcome....

>
> FWIW I did once create a background worker extension [1] which logs
> configuration changes to a table, though it's not particularly maintained or
> recommended for production use.

I'm happy to change course if the consensus on the list favors using something larger, like log files or logging to a
table,but for now I'm still thinking about this in terms of something smaller than that. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company