Re: Catalog version access - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Catalog version access
Date
Msg-id Ye94sDRp0AoPV0H2@paquier.xyz
Whole thread Raw
In response to Re: Catalog version access  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: Catalog version access
List pgsql-hackers
On Mon, Jan 24, 2022 at 08:40:08PM +0000, Bossart, Nathan wrote:
> On 1/23/22, 7:31 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
>> On Mon, Aug 16, 2021 at 06:12:54PM +0000, Bossart, Nathan wrote:
>>> I was looking at the --check switch for the postgres binary recently
>>> [0], and this sounds like something that might fit in nicely there.
>>> In the attached patch, --check will also check the control file if one
>>> exists.
>>
>> This would not work on a running postmaster as CreateDataDirLockFile()
>> is called beforehand, but we want this capability, no?
>
> I was not under the impression this was a requirement, based on the
> use-case discussed upthread [0].

Hmm.  I got a different impression as of this one:
https://www.postgresql.org/message-id/3496407.1613955241@sss.pgh.pa.us
But I can see downthread that this is not the case.  Sorry for the
confusion.

>> Abusing a bootstrap option for this purpose does not look like a good
>> idea, to be honest, especially for something only used internally now
>> and undocumented, but we want to use something aimed at an external
>> use with some documentation.  Using a separate switch would be more
>> adapted IMO.
>
> This is the opposite of what Magnus proposed earlier [1].  Do we need
> a new pg_ctl option for this?  It seems like it is really only
> intended for use by PostgreSQL developers, but perhaps there are other
> use-cases I am not thinking of.  In any case, the pg_ctl option would
> probably end up using --check (or another similar mode) behind the
> scenes.

Based on the latest state of the thread, I am understanding that we
don't want a new option for pg_ctl for this feature, and using a
bootstrap's --check for this purpose is not a good idea IMO.  What I
guess from Magnus' suggestion would be to add a completely different
switch.

Once you remove the requirement of a running server, we have basically
what has been recently implemented with postgres -C for
runtime-computed GUCs, because we already go through a read of the
control file to be able to print those GUCs with their correct
values.  This also means that it is already possible to check if a
data folder is compatible with a set of binaries with this facility,
as any postgres -C command with a runtime GUC would trigger this
check.  Using any of the existing runtime GUCs may be confusing, but
that would work.  And I am not really convinced that we have any need
to add a specific GUC for this purpose, be it a sort of
is_controlfile_valid or controlfile_checksum (CRC32 of the control
file).

>> Also, I think that we should be careful with the read of
>> the control file to avoid false positives?   We can rely on an atomic
>> read/write thanks to its maximum size of 512 bytes, but this looks
>> like a lot what has been done recently with postgres -C for runtime
>> GUCs, that *require* a read of the control file before grabbing the
>> values we are interested in.
>
> Sorry, I'm not following this one.  In my proposed patch, the control
> file (if one exists) is read after CreateDataDirLockFile(), just like
> PostmasterMain().

While looking at the patch, I was thinking about the fact that we may
want to support the case even if a server is running.  If we don't, my
worries about the concurrent control file activities are moot.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: "Shinoda, Noriyoshi (PN Japan FSIP)"
Date:
Subject: RE: refactoring basebackup.c
Next
From: Dilip Kumar
Date:
Subject: Re: autovacuum prioritization