Thread: archive modules

archive modules

From
"Bossart, Nathan"
Date:
This thread is a continuation of the thread with the subject
"parallelizing the archiver" [0].  That thread had morphed into an
effort to allow creating archive modules, so I've created a new one to
ensure that this topic has the proper visibility.

I've attached the latest patch from the previous thread.  This patch
does a few things.  First, it adds the archive_library GUC that
specifies a library to use in place of archive_command.  If
archive_library is set to "shell" (the default), archive_command is
still used.  The archive_library is preloaded, so its _PG_init() can
do anything that libraries loaded via shared_preload_libraries can do.
Like logical decoding output plugins, archive modules must define an
initialization function and some callbacks.  The patch also introduces
the basic_archive module to ensure test coverage on the new
infrastructure.

Nathan

[0] https://www.postgresql.org/message-id/flat/BC4D6BB2-6976-4397-A417-A6A30EEDC63E%40amazon.com


Attachment

Re: archive modules

From
Fujii Masao
Date:

On 2021/11/02 3:54, Bossart, Nathan wrote:
> This thread is a continuation of the thread with the subject
> "parallelizing the archiver" [0].  That thread had morphed into an
> effort to allow creating archive modules, so I've created a new one to
> ensure that this topic has the proper visibility.

What is the main motivation of this patch? I was thinking that
it's for parallelizing WAL archiving. But as far as I read
the patch very briefly, WAL file name is still passed to
the archive callback function one by one.

Are you planning to extend this mechanism to other WAL
archiving-related commands like restore_command? I can imagine
that those who use archive library (rather than shell) would
like to use the same mechanism for WAL restore.


> I've attached the latest patch from the previous thread.  This patch
> does a few things.  First, it adds the archive_library GUC that
> specifies a library to use in place of archive_command.  If
> archive_library is set to "shell" (the default), archive_command is
> still used.  The archive_library is preloaded, so its _PG_init() can
> do anything that libraries loaded via shared_preload_libraries can do.
> Like logical decoding output plugins, archive modules must define an
> initialization function and some callbacks.  The patch also introduces
> the basic_archive module to ensure test coverage on the new
> infrastructure.

I think that it's worth adding this module into core
rather than handling it as test module. It provides very basic
WAL archiving feature, but (I guess) it's enough for some users.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: archive modules

From
Michael Paquier
Date:
On Tue, Nov 02, 2021 at 01:43:54PM +0900, Fujii Masao wrote:
> On 2021/11/02 3:54, Bossart, Nathan wrote:
>> This thread is a continuation of the thread with the subject
>> "parallelizing the archiver" [0].  That thread had morphed into an
>> effort to allow creating archive modules, so I've created a new one to
>> ensure that this topic has the proper visibility.
>
> What is the main motivation of this patch? I was thinking that
> it's for parallelizing WAL archiving. But as far as I read
> the patch very briefly, WAL file name is still passed to
> the archive callback function one by one.

It seems to me that this patch is not moving into the right direction
implementation-wise (I have read the arguments about
backward-compatibility that led to the introduction of archive_library
and its shell mode), for what looks like a duplicate of
shared_preload_libraries but for an extra code path dedicated to the
archiver, where we could just have a hook instead?  We have been
talking for some time now to make the archiver process more
bgworker-ish, so as we finish with something closer to what the
logical replication launcher is.
--
Michael

Attachment

Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/1/21, 9:44 PM, "Fujii Masao" <masao.fujii@oss.nttdata.com> wrote:
> What is the main motivation of this patch? I was thinking that
> it's for parallelizing WAL archiving. But as far as I read
> the patch very briefly, WAL file name is still passed to
> the archive callback function one by one.

The main motivation is provide a way to archive without shelling out.
This reduces the amount of overhead, which can improve archival rate
significantly.  It should also make it easier to archive more safely.
For example, many of the common shell commands used for archiving
won't fsync the data, but it isn't too hard to do so via C.  The
current proposal doesn't introduce any extra infrastructure for
batching or parallelism, but it is probably still possible.  I would
like to eventually add batching, but for now I'm only focused on
introducing basic archive module support.

> Are you planning to extend this mechanism to other WAL
> archiving-related commands like restore_command? I can imagine
> that those who use archive library (rather than shell) would
> like to use the same mechanism for WAL restore.

I would like to do this eventually, but my current proposal is limited
to archive_command.

> I think that it's worth adding this module into core
> rather than handling it as test module. It provides very basic
> WAL archiving feature, but (I guess) it's enough for some users.

Do you think it should go into contrib?

Nathan


Re: archive modules

From
Robert Haas
Date:
On Tue, Nov 2, 2021 at 1:24 AM Michael Paquier <michael@paquier.xyz> wrote:
> It seems to me that this patch is not moving into the right direction
> implementation-wise (I have read the arguments about
> backward-compatibility that led to the introduction of archive_library
> and its shell mode), for what looks like a duplicate of
> shared_preload_libraries but for an extra code path dedicated to the
> archiver, where we could just have a hook instead?  We have been
> talking for some time now to make the archiver process more
> bgworker-ish, so as we finish with something closer to what the
> logical replication launcher is.

Why in the world would you want a plain hook rather than something
closer to the way logical replication works?

Plain hooks are annoying to use. If you load things at the wrong time,
it silently doesn't work. It's also impossible to unload anything. If
you want to change to a different module, you probably have to bounce
the whole server instead of just changing the GUC and letting it load
the new module when you run 'pg_ctl reload'.

Blech! :-)

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
"Bossart, Nathan"
Date:
I've just realized I forgot to CC the active participants on the last
thread to this one, so I've attempted to do that now.  I didn't
intentionally leave anyone out, but I'm sorry if I missed someone.

On 11/1/21, 10:24 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> It seems to me that this patch is not moving into the right direction
> implementation-wise (I have read the arguments about
> backward-compatibility that led to the introduction of archive_library
> and its shell mode), for what looks like a duplicate of
> shared_preload_libraries but for an extra code path dedicated to the
> archiver, where we could just have a hook instead?  We have been
> talking for some time now to make the archiver process more 
> bgworker-ish, so as we finish with something closer to what the 
> logical replication launcher is.

Hm.  It sounds like you want something more like what was discussed
earlier in the other thread [0].  This approach would basically just
add a hook and call it a day.  My patch for this approach [1] moved
the shell archive logic to a test module, but the general consensus
seems to be that we'd need to have it be present in core (and the
default).

Nathan

[0] https://postgr.es/m/8B7BF404-29D4-4662-A2DF-1AC4C98463EB%40amazon.com
[1] https://postgr.es/m/attachment/127385/v2-0001-Replace-archive_command-with-a-hook.patch


Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 8:11 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> Why in the world would you want a plain hook rather than something
> closer to the way logical replication works?
>
> Plain hooks are annoying to use. If you load things at the wrong time,
> it silently doesn't work. It's also impossible to unload anything. If
> you want to change to a different module, you probably have to bounce
> the whole server instead of just changing the GUC and letting it load
> the new module when you run 'pg_ctl reload'.

Well, the current patch does require a reload since the modules are
preloaded, but maybe there is some way to avoid that.  However, I
agree with the general argument that a dedicated archive module
interface will be easier to use correctly.  With a hook, it could be
hard to know which library's archive function we are actually using.
With archive_library, we know the exact library's callback functions
we are using.

I think an argument for just adding a hook is that it's simpler and
easier to maintain.  But continuous archiving is a pretty critical
piece of functionality, so I think it's a great candidate for some
sturdy infrastructure.

Nathan


Re: archive modules

From
Robert Haas
Date:
On Tue, Nov 2, 2021 at 11:26 AM Bossart, Nathan <bossartn@amazon.com> wrote:
> Well, the current patch does require a reload since the modules are
> preloaded, but maybe there is some way to avoid that.

I think we could set things up so that if the value changes, you call
a shutdown hook for the old library, load the new one, and call any
startup hook for that one.

That wouldn't work with a hook-based approach.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 8:57 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Tue, Nov 2, 2021 at 11:26 AM Bossart, Nathan <bossartn@amazon.com> wrote:
>> Well, the current patch does require a reload since the modules are
>> preloaded, but maybe there is some way to avoid that.
>
> I think we could set things up so that if the value changes, you call
> a shutdown hook for the old library, load the new one, and call any
> startup hook for that one.

Yes, that seems doable.  My point is that I've intentionally chosen to
preload the libraries at the moment so that it's possible to define
PGC_POSTMASTER GUCs and to use RegisterBackgroundWorker().  If we
think that switching archive modules without restarting is more
important, I believe we will need to take on a few restrictions.

Nathan


Re: archive modules

From
Robert Haas
Date:
On Tue, Nov 2, 2021 at 12:10 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> Yes, that seems doable.  My point is that I've intentionally chosen to
> preload the libraries at the moment so that it's possible to define
> PGC_POSTMASTER GUCs and to use RegisterBackgroundWorker().  If we
> think that switching archive modules without restarting is more
> important, I believe we will need to take on a few restrictions.

I guess I'm failing to understand what the problem is. You can set
GUCs of the form foo.bar in postgresql.conf anyway, right?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 9:17 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Tue, Nov 2, 2021 at 12:10 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>> Yes, that seems doable.  My point is that I've intentionally chosen to
>> preload the libraries at the moment so that it's possible to define
>> PGC_POSTMASTER GUCs and to use RegisterBackgroundWorker().  If we
>> think that switching archive modules without restarting is more
>> important, I believe we will need to take on a few restrictions.
>
> I guess I'm failing to understand what the problem is. You can set
> GUCs of the form foo.bar in postgresql.conf anyway, right?

I must not be explaining it well, sorry.  I'm mainly thinking about
the following code snippets.

In guc.c:
        /*
         * Only allow custom PGC_POSTMASTER variables to be created during shared
         * library preload; any later than that, we can't ensure that the value
         * doesn't change after startup.  This is a fatal elog if it happens; just
         * erroring out isn't safe because we don't know what the calling loadable
         * module might already have hooked into.
         */
        if (context == PGC_POSTMASTER &&
                !process_shared_preload_libraries_in_progress)
                elog(FATAL, "cannot create PGC_POSTMASTER variables after startup");

In bgworker.c:
        if (!process_shared_preload_libraries_in_progress &&
                strcmp(worker->bgw_library_name, "postgres") != 0)
        {
                if (!IsUnderPostmaster)
                        ereport(LOG,
                                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                         errmsg("background worker \"%s\": must be registered in
shared_preload_libraries",
                                                        worker->bgw_name)));
                return;
        }

You could still introduce GUCs in _PG_init(), but they couldn't be
defined as PGC_POSTMASTER.  Also, you could still use
RegisterDynamicBackgroundWorker() to register a background worker, but
you couldn't use RegisterBackgroundWorker().  These might be
acceptable restrictions if swapping archive libraries on the fly seems
more important, but I wanted to bring that front and center to make
sure everyone understands the tradeoffs.

It's also entirely possible I'm misunderstanding something here...

Nathan


Re: archive modules

From
Robert Haas
Date:
On Tue, Nov 2, 2021 at 12:39 PM Bossart, Nathan <bossartn@amazon.com> wrote:>
> On 11/2/21, 9:17 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> You could still introduce GUCs in _PG_init(), but they couldn't be
> defined as PGC_POSTMASTER.

It seems like PGC_POSTMASTER isn't very desirable anyway. Wouldn't you
want PGC_SIGHUP? I mean I'm not saying there couldn't be a case where
that wouldn't work, because you could need a big chunk of shared
memory allocated at startup time or something. But in that's probably
not typical, and if it does happen well then that particular archive
module has to be preloaded. If you know that you have several archive
modules that you want to use, you can still preload them all if for
this or any other reason you want to do that. But in a lot of cases
it's not going to be necessary.

In other words, if we use hooks, then you're guaranteed to need a
server restart to change anything. If we use something like what you
have now, there can be corner cases where you need that or benefits of
preloading, but it's not a hard requirement, and in many cases you can
get by without it. That seems strictly better to me ... but maybe I'm
still confused.

> Also, you could still use
> RegisterDynamicBackgroundWorker() to register a background worker, but
> you couldn't use RegisterBackgroundWorker().  These might be
> acceptable restrictions if swapping archive libraries on the fly seems
> more important, but I wanted to bring that front and center to make
> sure everyone understands the tradeoffs.

RegisterDynamicBackgroundWorker() seems way better, though. It's hard
for me to understand why this would be a problem for anybody. And
again, if somebody does have that need, they can always fall back to
saying that their particular module should be preloaded if you want to
use it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 9:46 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Tue, Nov 2, 2021 at 12:39 PM Bossart, Nathan <bossartn@amazon.com> wrote:>
>> On 11/2/21, 9:17 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>> You could still introduce GUCs in _PG_init(), but they couldn't be
>> defined as PGC_POSTMASTER.
>
> It seems like PGC_POSTMASTER isn't very desirable anyway. Wouldn't you
> want PGC_SIGHUP? I mean I'm not saying there couldn't be a case where
> that wouldn't work, because you could need a big chunk of shared
> memory allocated at startup time or something. But in that's probably
> not typical, and if it does happen well then that particular archive
> module has to be preloaded. If you know that you have several archive
> modules that you want to use, you can still preload them all if for
> this or any other reason you want to do that. But in a lot of cases
> it's not going to be necessary.
>
> In other words, if we use hooks, then you're guaranteed to need a
> server restart to change anything. If we use something like what you
> have now, there can be corner cases where you need that or benefits of
> preloading, but it's not a hard requirement, and in many cases you can
> get by without it. That seems strictly better to me ... but maybe I'm
> still confused.
>
>> Also, you could still use
>> RegisterDynamicBackgroundWorker() to register a background worker, but
>> you couldn't use RegisterBackgroundWorker().  These might be
>> acceptable restrictions if swapping archive libraries on the fly seems
>> more important, but I wanted to bring that front and center to make
>> sure everyone understands the tradeoffs.
>
> RegisterDynamicBackgroundWorker() seems way better, though. It's hard
> for me to understand why this would be a problem for anybody. And
> again, if somebody does have that need, they can always fall back to
> saying that their particular module should be preloaded if you want to
> use it.

I agree.  I'll make sure the archive library can be changed via SIGHUP
in the next revision.

Nathan


Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 10:29 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> I agree.  I'll make sure the archive library can be changed via SIGHUP
> in the next revision.

And here it is.

Nathan


Attachment

Re: archive modules

From
Fujii Masao
Date:

On 2021/11/03 0:03, Bossart, Nathan wrote:
> On 11/1/21, 9:44 PM, "Fujii Masao" <masao.fujii@oss.nttdata.com> wrote:
>> What is the main motivation of this patch? I was thinking that
>> it's for parallelizing WAL archiving. But as far as I read
>> the patch very briefly, WAL file name is still passed to
>> the archive callback function one by one.
> 
> The main motivation is provide a way to archive without shelling out.
> This reduces the amount of overhead, which can improve archival rate
> significantly.

It's helpful if you share how much this approach reduces
the amount of overhead.


> It should also make it easier to archive more safely.
> For example, many of the common shell commands used for archiving
> won't fsync the data, but it isn't too hard to do so via C.

But probably we can do the same thing even by using the existing
shell interface? For example, we can implement and provide
the C program of the archive command that fsync's the file?
Users can just use it in archive_command.


> The
> current proposal doesn't introduce any extra infrastructure for
> batching or parallelism, but it is probably still possible.  I would
> like to eventually add batching, but for now I'm only focused on
> introducing basic archive module support.

Understood. I agree that it's reasonable to implement them gradually.


>> Are you planning to extend this mechanism to other WAL
>> archiving-related commands like restore_command? I can imagine
>> that those who use archive library (rather than shell) would
>> like to use the same mechanism for WAL restore.
> 
> I would like to do this eventually, but my current proposal is limited
> to archive_command.

Understood.

  
>> I think that it's worth adding this module into core
>> rather than handling it as test module. It provides very basic
>> WAL archiving feature, but (I guess) it's enough for some users.
> 
> Do you think it should go into contrib?

Yes, at least for me..

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: archive modules

From
David Steele
Date:
On 11/7/21 1:04 AM, Fujii Masao wrote:
> 
> On 2021/11/03 0:03, Bossart, Nathan wrote:
>> On 11/1/21, 9:44 PM, "Fujii Masao" <masao.fujii@oss.nttdata.com> wrote:
>>> What is the main motivation of this patch? I was thinking that
>>> it's for parallelizing WAL archiving. But as far as I read
>>> the patch very briefly, WAL file name is still passed to
>>> the archive callback function one by one.
>>
>> The main motivation is provide a way to archive without shelling out.
>> This reduces the amount of overhead, which can improve archival rate
>> significantly.
> 
> It's helpful if you share how much this approach reduces
> the amount of overhead.

FWIW we have a test for this in pgBackRest. Running 
`archive_command=pgbackrest archive-push ...` 1000 times via system() 
yields an average of 3ms per execution. pgBackRest reports ~1ms of time 
here so the system() overhead is ~2ms. These times are on my very fast 
workstation and in my experience servers are quite a bit slower.

This doesn't tell the entire story, though, because in this test 
pgBackRest is just checking notifications being returned by an async 
process that was spawned earlier. This complexity exists to save the 
startup costs of, e.g. establishing an SSH connection, which is often > 
1 second.

This module would make it far easier to pay those startup costs a single 
time, or at least only occasionally, making it possible to write 
performant archivers with less complexity than is currently possible.

>> It should also make it easier to archive more safely.
>> For example, many of the common shell commands used for archiving
>> won't fsync the data, but it isn't too hard to do so via C.
> 
> But probably we can do the same thing even by using the existing
> shell interface? For example, we can implement and provide
> the C program of the archive command that fsync's the file?
> Users can just use it in archive_command.

It is far more common to be writing WAL segments to another host or 
object storage. In either case I believe a local fsync file command is 
not very useful.

>>> I think that it's worth adding this module into core
>>> rather than handling it as test module. It provides very basic
>>> WAL archiving feature, but (I guess) it's enough for some users.
>>
>> Do you think it should go into contrib?

I would prefer this module to be in core as our standard implementation 
and load by default in a vanilla install.

Regards,
-- 
-David
david@pgmasters.net



Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/10/21, 8:10 AM, "David Steele" <david@pgmasters.net> wrote:
> On 11/7/21 1:04 AM, Fujii Masao wrote:
>> It's helpful if you share how much this approach reduces
>> the amount of overhead.
>
> FWIW we have a test for this in pgBackRest. Running
> `archive_command=pgbackrest archive-push ...` 1000 times via system()
> yields an average of 3ms per execution. pgBackRest reports ~1ms of time
> here so the system() overhead is ~2ms. These times are on my very fast
> workstation and in my experience servers are quite a bit slower.

In the previous thread [0], I noted a 50% speedup for a basic
archiving strategy that involved copying the file to a different
directory.

> I would prefer this module to be in core as our standard implementation
> and load by default in a vanilla install.

Hm.  I think I disagree with putting it in contrib and with making it
the default archive library.  The first reason is backward
compatibility.  There has already been quite a bit of discussion about
this, and I don't see how we can get away with anything except for
maintaining the existing behavior for now.  Maybe we could move to a
better default down the road, but I'm hesitant to press that issue too
much at the moment.

The second reason is that the basic_archive module has a couple of
deficiencies.  For example, it doesn't handle inconvenient server
crashes well (e.g., after archiving but before we've renamed the
.ready file).  A way to fix this might be to compare the archive file
with the to-be-archived file and to succeed if they are exactly the
same.  Another problem is that there is no handling for multiple
servers using basic_archive to write WAL to the same location.  This
is because basic_archive first copies data to a temporary file that is
always named "archtemp."  This might be fixed by appending a random
string to the temporary file or by locking it somehow, but there are
still a few things left to figure out.

I think it'd be awesome to eventually fix all these issues in
basic_archive and to recommend it as a proper archiving strategy, but
I'm worried that this will introduce a significant amount of
complexity to this patch.  I really only intended for basic_archive to
be used for testing and to demonstrate that it's possible use the
archive module infrastructure to do something useful.  If folks really
want it in contrib, I'd at least add a big warning about the
aforementioned problems in its docs.

Nathan

[0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com


Re: archive modules

From
David Steele
Date:
On 11/10/21 1:22 PM, Bossart, Nathan wrote:
> On 11/10/21, 8:10 AM, "David Steele" <david@pgmasters.net> wrote:
> 
>> I would prefer this module to be in core as our standard implementation
>> and load by default in a vanilla install.
> 
> Hm.  I think I disagree with putting it in contrib and with making it
> the default archive library.  The first reason is backward
> compatibility.  There has already been quite a bit of discussion about
> this, and I don't see how we can get away with anything except for
> maintaining the existing behavior for now.  Maybe we could move to a
> better default down the road, but I'm hesitant to press that issue too
> much at the moment.

OK, I haven't had to go over the patch in detail so I didn't realize the 
module was not backwards compatible. I'll have a closer look soon.

> The second reason is that the basic_archive module has a couple of
> deficiencies.  For example, it doesn't handle inconvenient server
> crashes well (e.g., after archiving but before we've renamed the
> .ready file).  A way to fix this might be to compare the archive file
> with the to-be-archived file and to succeed if they are exactly the
> same.  Another problem is that there is no handling for multiple
> servers using basic_archive to write WAL to the same location.  This
> is because basic_archive first copies data to a temporary file that is
> always named "archtemp."  This might be fixed by appending a random
> string to the temporary file or by locking it somehow, but there are
> still a few things left to figure out.

Honestly, I'm not sure to what extent it makes sense to delve into these 
problems for an archiver that basically just copies to another 
directory. This is a not a very realistic solution for the common 
storage requirements we are seeing these days.

> I think it'd be awesome to eventually fix all these issues in
> basic_archive and to recommend it as a proper archiving strategy, but
> I'm worried that this will introduce a significant amount of
> complexity to this patch.  I really only intended for basic_archive to
> be used for testing and to demonstrate that it's possible use the
> archive module infrastructure to do something useful.  If folks really
> want it in contrib, I'd at least add a big warning about the
> aforementioned problems in its docs.

I'll have more to say once I've had a closer look, but in general I 
agree with what you have said here. Keeping it in test for now is likely 
to be the best approach.

Regards,
-- 
-David
david@pgmasters.net



Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/10/21, 10:42 AM, "David Steele" <david@pgmasters.net> wrote:
> OK, I haven't had to go over the patch in detail so I didn't realize the
> module was not backwards compatible. I'll have a closer look soon.

It's backward-compatible in the sense that you'd be able to switch
archive_library to "shell" to continue using archive_command, but
archive_command is otherwise unused.  The proposed patch sets
archive_library to "shell" by default.

> Honestly, I'm not sure to what extent it makes sense to delve into these
> problems for an archiver that basically just copies to another
> directory. This is a not a very realistic solution for the common
> storage requirements we are seeing these days.

Agreed.

> I'll have more to say once I've had a closer look, but in general I
> agree with what you have said here. Keeping it in test for now is likely
> to be the best approach.

Looking forward to your feedback.

Nathan


Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/10/21, 10:53 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> Looking forward to your feedback.

Here is a rebased patch (beb4e9b broke v8).

Nathan


Attachment

Re: archive modules

From
"Bossart, Nathan"
Date:
CC'd a few folks who were on earlier messages in this thread but have
since been left out for whatever reason.

On 11/11/21, 3:02 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> Here is a rebased patch (beb4e9b broke v8).

I went ahead and split the patch into 4 separate patches in an effort
to ease review.  0001 just refactors the shell archiving logic to its
own file.  0002 introduces the archive modules infrastructure.  0003
introduces the basic_archive test module.  And 0004 is the docs.

Nathan


Attachment

Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/19/21, 11:24 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> I went ahead and split the patch into 4 separate patches in an effort
> to ease review.  0001 just refactors the shell archiving logic to its
> own file.  0002 introduces the archive modules infrastructure.  0003
> introduces the basic_archive test module.  And 0004 is the docs.

Here is a rebased patch set (1b06d7b broke v10).

Nathan


Attachment

Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/2/21, 8:07 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> The main motivation is provide a way to archive without shelling out.
> This reduces the amount of overhead, which can improve archival rate
> significantly.  It should also make it easier to archive more safely.
> For example, many of the common shell commands used for archiving
> won't fsync the data, but it isn't too hard to do so via C.  The
> current proposal doesn't introduce any extra infrastructure for
> batching or parallelism, but it is probably still possible.  I would
> like to eventually add batching, but for now I'm only focused on
> introducing basic archive module support.

As noted above, the latest patch set (v11) doesn't add any batching or
parallelism.  Now that beb4e9b is committed (which causes the archiver
to gather multiple files to archive in each scan of archive_status),
it seems like a good time to discuss this a bit further.  I think
there are some interesting design considerations.

As is, the current archive module infrastructure in the v11 patch set
should help reduce the amount of overhead per-file quite a bit, and I
observed a noticeable speedup with a basic file-copying archive
strategy (although this is likely not representative of real-world
workloads).  I believe it would be possible for archive module authors
to implement batching/parallelism, but AFAICT it would still require
hacks similar to what folks do today with archive_command.  For
example, you could look ahead in archive_status, archive a bunch of
files in a batch or in parallel with background workers, and then
quickly return true when the archive_library is called for later files
in the batch.

Alternatively, we could offer some kind of built-in batching support
in the archive module infrastructure.  One simple approach would be to
just have pgarch_readyXlog() optionally return the entire list of
files gathered from the directory scan of archive_status (presently up
to 64 files).  Or we could provide a GUC like archive_batch_size that
would allow users to limit how many files are sent to the
archive_library each time.  This list would be given to
pgarch_archiveXlog(), which would return which files were successfully
archived and which failed.  I think this could be done for
archive_command as well, although it might be tricky to determine
which files were archived successfully.  To handle that, we might just
need to fail the whole batch if the archive_command return value
indicates failure.

Another interesting change is that the special timeline file handling
added in beb4e9b becomes less useful.  Presently, if a timeline
history file is marked ready for archival, we force pgarch_readyXlog()
to do a new scan of archive_status the next time it is called in order
to pick it up as soon as possible (ordinarily it just returns the
files gathered in a previous scan until it runs out).  If we are
sending a list of files to the archive module, it will be more
difficult to ensure timeline history files are picked up so quickly.
Perhaps this is a reasonable tradeoff to make when archive batching is
enabled.

I think the retry logic can stay roughly the same.  If any files in a
batch cannot be archived, wait a second before retrying.  If that
happens a few times in a row, stop archiving for a bit.  It wouldn't
be quite as precise as what's there today because the failures could
be for different files each time, but I don't know if that is terribly
important.

Finally, I wonder if batching support is something we should bother
with at all for the first round of archive module support.  I believe
it is something that could be easily added later, although it might
require archive modules to adjust the archiving callback to accept and
return a list of files.  IMO the archive modules infrastructure is
still an improvement even without batching, and it seems to fit nicely
into the existing behavior of the archiver process.  I'm curious what
others think about all this.

Nathan


Re: archive modules

From
"Bossart, Nathan"
Date:
On 11/22/21, 10:01 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> On 11/19/21, 11:24 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
>> I went ahead and split the patch into 4 separate patches in an effort
>> to ease review.  0001 just refactors the shell archiving logic to its
>> own file.  0002 introduces the archive modules infrastructure.  0003
>> introduces the basic_archive test module.  And 0004 is the docs.
>
> Here is a rebased patch set (1b06d7b broke v10).

I'm attempting to make cfbot happy again with v12.  It looked like
there was a missing #include for Windows.

Nathan


Attachment

Re: archive modules

From
"Bossart, Nathan"
Date:
On 1/5/22, 3:14 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> On 11/22/21, 10:01 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
>> On 11/19/21, 11:24 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
>>> I went ahead and split the patch into 4 separate patches in an effort
>>> to ease review.  0001 just refactors the shell archiving logic to its
>>> own file.  0002 introduces the archive modules infrastructure.  0003
>>> introduces the basic_archive test module.  And 0004 is the docs.
>>
>> Here is a rebased patch set (1b06d7b broke v10).
>
> I'm attempting to make cfbot happy again with v12.  It looked like
> there was a missing #include for Windows.

Here is another rebase for cfbot.

Nathan


Attachment

Re: archive modules

From
Robert Haas
Date:
On Thu, Jan 13, 2022 at 2:38 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> Here is another rebase for cfbot.

I've committed 0001 now. I don't see anything particularly wrong with
the rest of this either, but here are a few comments:

- I wonder whether it might be better to promote the basic archiving
module to contrib (as compared with src/test/modules) and try to
harden it to the extent that such hardening is required. I think a lot
of people would get good use out of that. It might not be a completely
baked solution, but a solution doesn't have to be completely baked to
be a massive improvement over the stupidity endorsed by our current
documentation.

- I wonder whether it's a good idea to silently succeed if the file
exists and has the same contents as the file we're trying to archive.
ISTR that being necessary behavior for robustness, because what if we
archive the file and then die before recording the fact that we
archived it?

- If we load a new archive library, should we give the old one a
callback so it can shut down? And should the archiver considering
exiting since we can't unload? It isn't necessary but it might be
nicer.

- I believe we decided some time back to invoke function pointers
(*like)(this) rather than like(this) for clarity. It was judged that
something->like(this) was fine because that can only be a function
pointer, so no need to write (*(something->like))(this), but
like(this) could make you think that "like" is a plain function rather
than a function pointer.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Fri, Jan 28, 2022 at 02:06:50PM -0500, Robert Haas wrote:
> I've committed 0001 now. I don't see anything particularly wrong with
> the rest of this either, but here are a few comments:

Thanks!

> - I wonder whether it might be better to promote the basic archiving
> module to contrib (as compared with src/test/modules) and try to
> harden it to the extent that such hardening is required. I think a lot
> of people would get good use out of that. It might not be a completely
> baked solution, but a solution doesn't have to be completely baked to
> be a massive improvement over the stupidity endorsed by our current
> documentation.

This has been suggested a few times in this thread, so I'll go ahead and
move it to contrib.  I am clearly outnumbered! :)

I discussed the two main deficiencies I'm aware of with basic_archive
earlier [0].  The first one is the issue with "incovenient" server crashes
(mentioned below).  The second is that there is no handling for multiple
servers writing to the same location since the temporary file is always
named "archtemp."  I thought about a few ways to pick a unique file name
(or at least one that is _probably_ unique), but that began adding a lot of
complexity for something I intended as a test module.  I can spend some
more time on this if you think it's worth fixing for a contrib module.

> - I wonder whether it's a good idea to silently succeed if the file
> exists and has the same contents as the file we're trying to archive.
> ISTR that being necessary behavior for robustness, because what if we
> archive the file and then die before recording the fact that we
> archived it?

Yes.  The only reason I didn't proceed with this earlier is because the
logic became a sizable chunk of the module.  I will add this in the next
revision.

> - If we load a new archive library, should we give the old one a
> callback so it can shut down? And should the archiver considering
> exiting since we can't unload? It isn't necessary but it might be
> nicer.

Good idea.  I'll look into this.

> - I believe we decided some time back to invoke function pointers
> (*like)(this) rather than like(this) for clarity. It was judged that
> something->like(this) was fine because that can only be a function
> pointer, so no need to write (*(something->like))(this), but
> like(this) could make you think that "like" is a plain function rather
> than a function pointer.

Will fix.

[0] https://postgr.es/m/A30D8D33-8944-4898-BCA8-C77C18258247%40amazon.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com/



Re: archive modules

From
Robert Haas
Date:
On Fri, Jan 28, 2022 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I discussed the two main deficiencies I'm aware of with basic_archive
> earlier [0].  The first one is the issue with "incovenient" server crashes
> (mentioned below).

Seems easy enough to rectify, if it's just a matter of silently-succeed-if-same.

> The second is that there is no handling for multiple
> servers writing to the same location since the temporary file is always
> named "archtemp."  I thought about a few ways to pick a unique file name
> (or at least one that is _probably_ unique), but that began adding a lot of
> complexity for something I intended as a test module.  I can spend some
> more time on this if you think it's worth fixing for a contrib module.

Well, my first thought was to wonder whether we even care about that
scenario, but I guess we probably do, at least a little bit.

How about:

1. Name temporary files like
archive_temp.${FINAL_NAME}.${PID}.${SOME_RANDOM_NUMBER}. Create them
with O_EXCL. If it fails, die.

2. Try not to leave files like this behind, perhaps installing an
on_proc_exit callback or similar, but accept that crashes and unlink()
failures will make it inevitable in some cases.

3. Document that crashes or other strange failure cases may leave
archive_temp.* files behind in the archive directory, and recommend
that users remove them before restarting the database after a crash
(or, with caution, removing them while the database is running if the
user is sure that the files are old and unrelated to any archiving
still in progress).

I'm not arguing that this is exactly the right idea. But I am arguing
that it shouldn't take a ton of engineering to come up with something
reasonable here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Fri, Jan 28, 2022 at 03:20:41PM -0500, Robert Haas wrote:
> On Fri, Jan 28, 2022 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> I discussed the two main deficiencies I'm aware of with basic_archive
>> earlier [0].  The first one is the issue with "incovenient" server crashes
>> (mentioned below).
> 
> Seems easy enough to rectify, if it's just a matter of silently-succeed-if-same.

Yes.

>> The second is that there is no handling for multiple
>> servers writing to the same location since the temporary file is always
>> named "archtemp."  I thought about a few ways to pick a unique file name
>> (or at least one that is _probably_ unique), but that began adding a lot of
>> complexity for something I intended as a test module.  I can spend some
>> more time on this if you think it's worth fixing for a contrib module.
> 
> Well, my first thought was to wonder whether we even care about that
> scenario, but I guess we probably do, at least a little bit.
> 
> How about:
> 
> 1. Name temporary files like
> archive_temp.${FINAL_NAME}.${PID}.${SOME_RANDOM_NUMBER}. Create them
> with O_EXCL. If it fails, die.
> 
> 2. Try not to leave files like this behind, perhaps installing an
> on_proc_exit callback or similar, but accept that crashes and unlink()
> failures will make it inevitable in some cases.
> 
> 3. Document that crashes or other strange failure cases may leave
> archive_temp.* files behind in the archive directory, and recommend
> that users remove them before restarting the database after a crash
> (or, with caution, removing them while the database is running if the
> user is sure that the files are old and unrelated to any archiving
> still in progress).
> 
> I'm not arguing that this is exactly the right idea. But I am arguing
> that it shouldn't take a ton of engineering to come up with something
> reasonable here.

This is roughly what I had in mind.  I'll give it a try.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Nathan Bossart
Date:
Here is a new revision.  I've moved basic_archive to contrib, hardened it
as suggested, and added shutdown support for archive modules.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Sat, Jan 29, 2022 at 12:50:18PM -0800, Nathan Bossart wrote:
> Here is a new revision.  I've moved basic_archive to contrib, hardened it
> as suggested, and added shutdown support for archive modules.

cfbot was unhappy with v14, so here's another attempt.  One other change I
am pondering is surrounding pgarch_MainLoop() with PG_TRY/PG_FINALLY so
that we can also call the shutdown callback in the event of an ERROR.  This
might be necessary for an archive module that uses background workers.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Sat, Jan 29, 2022 at 04:31:48PM -0800, Nathan Bossart wrote:
> On Sat, Jan 29, 2022 at 12:50:18PM -0800, Nathan Bossart wrote:
>> Here is a new revision.  I've moved basic_archive to contrib, hardened it
>> as suggested, and added shutdown support for archive modules.
> 
> cfbot was unhappy with v14, so here's another attempt.  One other change I
> am pondering is surrounding pgarch_MainLoop() with PG_TRY/PG_FINALLY so
> that we can also call the shutdown callback in the event of an ERROR.  This
> might be necessary for an archive module that uses background workers.

Ugh.  Apologies for the noise.  cfbot still isn't happy, so here's yet
another attempt.  This new patch set also ensures the shutdown callback is
called when the archiver process exits.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Sat, Jan 29, 2022 at 09:01:41PM -0800, Nathan Bossart wrote:
> On Sat, Jan 29, 2022 at 04:31:48PM -0800, Nathan Bossart wrote:
>> On Sat, Jan 29, 2022 at 12:50:18PM -0800, Nathan Bossart wrote:
>>> Here is a new revision.  I've moved basic_archive to contrib, hardened it
>>> as suggested, and added shutdown support for archive modules.
>> 
>> cfbot was unhappy with v14, so here's another attempt.  One other change I
>> am pondering is surrounding pgarch_MainLoop() with PG_TRY/PG_FINALLY so
>> that we can also call the shutdown callback in the event of an ERROR.  This
>> might be necessary for an archive module that uses background workers.
> 
> Ugh.  Apologies for the noise.  cfbot still isn't happy, so here's yet
> another attempt.  This new patch set also ensures the shutdown callback is
> called when the archiver process exits.

If basic_archive is to be in contrib, we probably want to avoid restarting
the archiver every time the module ERRORs.  I debated trying to add a
generic exception handler that all archive modules could use, but I suspect
many will have unique cleanup requirements.  Plus, AFAICT restarting the
archiver isn't terrible, it just causes most of the normal retry logic to
be skipped.

I also looked into rewriting basic_archive to avoid ERRORs and return false
for all failures, but this was rather tedious.  Instead, I just introduced
a custom exception handler for basic_archive's archive callback.  This
allows us to ERROR as necessary (which helps ensure that failures show up
in the server logs), and the archiver can treat it like a normal failure
and avoid restarting.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Robert Haas
Date:
On Mon, Jan 31, 2022 at 8:36 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> If basic_archive is to be in contrib, we probably want to avoid restarting
> the archiver every time the module ERRORs.  I debated trying to add a
> generic exception handler that all archive modules could use, but I suspect
> many will have unique cleanup requirements.  Plus, AFAICT restarting the
> archiver isn't terrible, it just causes most of the normal retry logic to
> be skipped.
>
> I also looked into rewriting basic_archive to avoid ERRORs and return false
> for all failures, but this was rather tedious.  Instead, I just introduced
> a custom exception handler for basic_archive's archive callback.  This
> allows us to ERROR as necessary (which helps ensure that failures show up
> in the server logs), and the archiver can treat it like a normal failure
> and avoid restarting.

I think avoiding ERROR is going to be impractical. Catching it in the
contrib module seems OK. Catching it in the generic code is probably
also possible to do in a reasonable way. Not catching the error also
seems like it would be OK, since we expect errors to be infrequent.
I'm not objecting to anything you did here, but I'm uncertain why
adding basic_archive along shell_archive changes the calculus here in
any way. It just seems like a separate problem.

+       /* Perform logging of memory contexts of this process */
+       if (LogMemoryContextPending)
+               ProcessLogMemoryContextInterrupt();

Any special reason for moving this up higher? Not really an issue, just curious.

+                       gettext_noop("This is unused if
\"archive_library\" does not indicate archiving via shell is
enabled.")

This contains a double negative. We could describe it more positively:
This is used only if \"archive_library\" specifies archiving via
shell. But that's actually a little confusing, because the way you've
set it up, archiving via shell can be specified by writing either
archive_library = '' or archive_library = 'shell'. I don't see any
particularly good reason to allow that to be spelled in two ways.
Let's pick one. Then here we can write either:

(a) This is used only if archive_library = 'shell'.
-or-
(b) This is used only if archive_library is not set.

IMHO, either of those would be clearer than what you have right now,
and it would definitely be shorter.

+/*
+ * Callback that gets called to determine if the archive module is
+ * configured.
+ */
+typedef bool (*ArchiveCheckConfiguredCB) (void);
+
+/*
+ * Callback called to archive a single WAL file.
+ */
+typedef bool (*ArchiveFileCB) (const char *file, const char *path);
+
+/*
+ * Called to shutdown an archive module.
+ */
+typedef void (*ArchiveShutdownCB) (void);

I think that this is the wrong amount of comments. One theory is that
the reader should refer to the documentation to understand how these
callbacks work. In that case, having a separate comment for each one
that doesn't really say anything is just taking up space. It would be
better to have one comment for all three lines referring the reader to
the documentation. Alternatively, one could take the position that the
explanation should go into these comments, and then perhaps we don't
even really need documentation. A one-line comment that doesn't really
say anything non-obvious seems like the worst amount.

+ <warning>
+  <para>
+   There are considerable robustness and security risks in using
archive modules
+   because, being written in the <literal>C</literal> language, they
have access
+   to many server resources.  Administrators wishing to enable archive modules
+   should exercise extreme caution.  Only carefully audited modules should be
+   loaded.
+  </para>
+ </warning>

Maybe I'm just old and jaded, but do we really need this? I know we
have the same thing for background workers, but if anything that seems
like an argument against duplicating it elsewhere. Lots of copies of
essentially identical warnings aren't the way to great documentation;
if we copy this here, we'll probably copy it to more places. And also,
it seems a bit like warning people that they shouldn't give their
complete financial records to total strangers about whom they have no
little or no information. Do tell.

+<programlisting>
+typedef bool (*ArchiveCheckConfiguredCB) (void);
+</programlisting>
+
+    If <literal>true</literal> is returned, the server will proceed with
+    archiving the file by calling the <function>archive_file_cb</function>
+    callback.  If <literal>false</literal> is returned, archiving will not
+    proceed.  In the latter case, the server will periodically call this
+    function, and archiving will proceed if it eventually returns
+    <literal>true</literal>.

It's not obvious from reading this why anyone would want to provide
this callback, or have it do anything other than 'return true'. But
there actually is a behavior difference if you provide this and have
it return false, vs. just having archiving itself fail. At least, the
message "archive_mode enabled, yet archiving is not configured" will
be emitted. So that's something we could mention here.

I would suggest s/if it eventually/only when it/

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
Thanks for the review!

On Wed, Feb 02, 2022 at 01:42:55PM -0500, Robert Haas wrote:
> I think avoiding ERROR is going to be impractical. Catching it in the
> contrib module seems OK. Catching it in the generic code is probably
> also possible to do in a reasonable way. Not catching the error also
> seems like it would be OK, since we expect errors to be infrequent.
> I'm not objecting to anything you did here, but I'm uncertain why
> adding basic_archive along shell_archive changes the calculus here in
> any way. It just seems like a separate problem.

The main scenario I'm thinking about is when there is no space left for
archives.  The shell archiving logic is pretty good about avoiding ERRORs,
so when there is a problem executing the command, the archiver will retry
the command a few times before giving up for a while.  If basic_archive
just ERROR'd due to ENOSPC, it would cause the archiver to restart.  Until
space frees up, I believe the archiver will end up restarting every 10
seconds.

I thought some more about adding a generic exception handler for the
archiving callback.  I think we'd need to add a new callback function that
would perform any required cleanup (e.g., closing any files that might be
left open).  That part isn't too bad.  However, module authors will also
need to keep in mind that the archiving callback runs in its own transient
memory context.  If the module needs to palloc() something that needs to
stick around for a while, it will need to do so in a different memory
context.  With sufficient documentation, maybe this part isn't too bad
either, but in the end, all of this is to save an optional ~15 lines of
code in the module.  It's not crucial to do your own ERROR handling in your
archive module, but if you want to, you can use basic_archive as a good
starting point.

tl;dr - I left it the same.

> +       /* Perform logging of memory contexts of this process */
> +       if (LogMemoryContextPending)
> +               ProcessLogMemoryContextInterrupt();
> 
> Any special reason for moving this up higher? Not really an issue, just curious.

Since archive_library changes cause the archiver to restart, I thought it
might be good to move this before the process might exit in case
LogMemoryContextPending and ConfigReloadPending are both true.

> +                       gettext_noop("This is unused if
> \"archive_library\" does not indicate archiving via shell is
> enabled.")
> 
> This contains a double negative. We could describe it more positively:
> This is used only if \"archive_library\" specifies archiving via
> shell. But that's actually a little confusing, because the way you've
> set it up, archiving via shell can be specified by writing either
> archive_library = '' or archive_library = 'shell'. I don't see any
> particularly good reason to allow that to be spelled in two ways.
> Let's pick one. Then here we can write either:
> 
> (a) This is used only if archive_library = 'shell'.
> -or-
> (b) This is used only if archive_library is not set.
> 
> IMHO, either of those would be clearer than what you have right now,
> and it would definitely be shorter.

I went with (b).  That felt a bit more natural to me, and it was easier to
code because I don't have to add error checking for an empty string.

> +/*
> + * Callback that gets called to determine if the archive module is
> + * configured.
> + */
> +typedef bool (*ArchiveCheckConfiguredCB) (void);
> +
> +/*
> + * Callback called to archive a single WAL file.
> + */
> +typedef bool (*ArchiveFileCB) (const char *file, const char *path);
> +
> +/*
> + * Called to shutdown an archive module.
> + */
> +typedef void (*ArchiveShutdownCB) (void);
> 
> I think that this is the wrong amount of comments. One theory is that
> the reader should refer to the documentation to understand how these
> callbacks work. In that case, having a separate comment for each one
> that doesn't really say anything is just taking up space. It would be
> better to have one comment for all three lines referring the reader to
> the documentation. Alternatively, one could take the position that the
> explanation should go into these comments, and then perhaps we don't
> even really need documentation. A one-line comment that doesn't really
> say anything non-obvious seems like the worst amount.

In my quest to write well-commented code, I sometimes overdo it.  I
adjusted these comments in the new revision.

> + <warning>
> +  <para>
> +   There are considerable robustness and security risks in using
> archive modules
> +   because, being written in the <literal>C</literal> language, they
> have access
> +   to many server resources.  Administrators wishing to enable archive modules
> +   should exercise extreme caution.  Only carefully audited modules should be
> +   loaded.
> +  </para>
> + </warning>
> 
> Maybe I'm just old and jaded, but do we really need this? I know we
> have the same thing for background workers, but if anything that seems
> like an argument against duplicating it elsewhere. Lots of copies of
> essentially identical warnings aren't the way to great documentation;
> if we copy this here, we'll probably copy it to more places. And also,
> it seems a bit like warning people that they shouldn't give their
> complete financial records to total strangers about whom they have no
> little or no information. Do tell.

I removed this in the new revision.

> +<programlisting>
> +typedef bool (*ArchiveCheckConfiguredCB) (void);
> +</programlisting>
> +
> +    If <literal>true</literal> is returned, the server will proceed with
> +    archiving the file by calling the <function>archive_file_cb</function>
> +    callback.  If <literal>false</literal> is returned, archiving will not
> +    proceed.  In the latter case, the server will periodically call this
> +    function, and archiving will proceed if it eventually returns
> +    <literal>true</literal>.
> 
> It's not obvious from reading this why anyone would want to provide
> this callback, or have it do anything other than 'return true'. But
> there actually is a behavior difference if you provide this and have
> it return false, vs. just having archiving itself fail. At least, the
> message "archive_mode enabled, yet archiving is not configured" will
> be emitted. So that's something we could mention here.

The blurb just above this provides a bit more information, but I tried to
add some additional context in the new revision anyway.

> I would suggest s/if it eventually/only when it/

Done.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Robert Haas
Date:
On Wed, Feb 2, 2022 at 5:44 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> > I would suggest s/if it eventually/only when it/
>
> Done.

Committed. I'm going to be 0% surprised if the buildfarm turns pretty
colors, but I don't know how to know what it's going to be unhappy
about except by trying it, so here goes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Thu, Feb 03, 2022 at 02:11:18PM -0500, Robert Haas wrote:
> Committed. I'm going to be 0% surprised if the buildfarm turns pretty
> colors, but I don't know how to know what it's going to be unhappy
> about except by trying it, so here goes.

Thanks!  I'll keep an eye on the buildfarm and will send any new patches
that are needed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Robert Haas
Date:
On Thu, Feb 3, 2022 at 2:27 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Thu, Feb 03, 2022 at 02:11:18PM -0500, Robert Haas wrote:
> > Committed. I'm going to be 0% surprised if the buildfarm turns pretty
> > colors, but I don't know how to know what it's going to be unhappy
> > about except by trying it, so here goes.
>
> Thanks!  I'll keep an eye on the buildfarm and will send any new patches
> that are needed.

Andres just pointed out to me that thorntail is unhappy:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-02-03%2019%3A54%3A42

It says:

==~_~===-=-===~_~==
pgsql.build/contrib/basic_archive/log/postmaster.log
==~_~===-=-===~_~==
2022-02-03 23:17:49.019 MSK [1253623:1] FATAL:  WAL archival cannot be
enabled when wal_level is "minimal"

The notes for the machine say:

UBSan; force_parallel_mode; wal_level=minimal; OS bug breaks truncate()

So apparently we need to either skip this test when wal_level=minimal,
or force a higher wal_level to be used for this particular test. Not
sure what the existing precedents are, if any.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
> So apparently we need to either skip this test when wal_level=minimal,
> or force a higher wal_level to be used for this particular test. Not
> sure what the existing precedents are, if any.

The only precedent I've found so far is test_decoding, which sets wal_level
to "logical."  Perhaps we can just set it to "replica" in
basic_archive.conf.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Robert Haas
Date:
On Thu, Feb 3, 2022 at 4:11 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
> > So apparently we need to either skip this test when wal_level=minimal,
> > or force a higher wal_level to be used for this particular test. Not
> > sure what the existing precedents are, if any.
>
> The only precedent I've found so far is test_decoding, which sets wal_level
> to "logical."  Perhaps we can just set it to "replica" in
> basic_archive.conf.

Yeah, that seems to make sense.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Thu, Feb 03, 2022 at 04:15:30PM -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 4:11 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
>> > So apparently we need to either skip this test when wal_level=minimal,
>> > or force a higher wal_level to be used for this particular test. Not
>> > sure what the existing precedents are, if any.
>>
>> The only precedent I've found so far is test_decoding, which sets wal_level
>> to "logical."  Perhaps we can just set it to "replica" in
>> basic_archive.conf.
> 
> Yeah, that seems to make sense.

024_archive_recovery.pl seems to do something similar.  Patch attached.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Robert Haas
Date:
On Thu, Feb 3, 2022 at 4:25 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> 024_archive_recovery.pl seems to do something similar.  Patch attached.

Committed. I think this is mostly an issue for pg_regress tests, as
opposed to 024_archive_recovery.pl, which is a TAP test. Maybe I'm
wrong about that, but it looks to me like most TAP tests choose what
they want explicitly, while pg_regress tests tend to inherit the
value.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: archive modules

From
Nathan Bossart
Date:
On Thu, Feb 03, 2022 at 04:45:52PM -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 4:25 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> 024_archive_recovery.pl seems to do something similar.  Patch attached.
> 
> Committed. I think this is mostly an issue for pg_regress tests, as
> opposed to 024_archive_recovery.pl, which is a TAP test. Maybe I'm
> wrong about that, but it looks to me like most TAP tests choose what
> they want explicitly, while pg_regress tests tend to inherit the
> value.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
talk to ben
Date:
Hi,

I am not sure why, but I can't find "basic_archive.archive_directory" in pg_settings the same way I would find for example : "pg_stat_statements.max".

[local]:5656 benoit@postgres=# SELECT count(*) FROM pg_settings WHERE name = 'basic_archive.archive_directory';
 count
-------
     0
(1 row)

show can find it if I use the complete name but tab completion can't find the guc:

[local]:5656 benoit@postgres=# show basic_archive.archive_directory;
 basic_archive.archive_directory
---------------------------------
 /home/benoit/tmp/tmp/archives
(1 row)

The archiver is configured with "basic_archive" and is working fine. I use this version of pg:

PostgreSQL 15beta2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20220421 (Red Hat 11.3.1-2), 64-bit

Re: archive modules

From
Nathan Bossart
Date:
On Wed, Jul 06, 2022 at 06:21:24PM +0200, talk to ben wrote:
> I am not sure why, but I can't find "basic_archive.archive_directory" in
> pg_settings the same way I would find for example :
> "pg_stat_statements.max".
> 
> [local]:5656 benoit@postgres=# SELECT count(*) FROM pg_settings WHERE name
> = 'basic_archive.archive_directory';
>  count
> -------
>      0
> (1 row)
> 
> show can find it if I use the complete name but tab completion can't find
> the guc:
> 
> [local]:5656 benoit@postgres=# show basic_archive.archive_directory;
>  basic_archive.archive_directory
> ---------------------------------
>  /home/benoit/tmp/tmp/archives
> (1 row)

I think the reason is that only the archiver process loads the library, so
the GUC isn't registered at startup like you'd normally see with
shared_preload_libraries.  IIUC the server will still create a placeholder
GUC during startup for custom parameters, which is why it shows up for SHOW
commands.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
talk to ben
Date:
I think the reason is that only the archiver process loads the library, so
the GUC isn't registered at startup like you'd normally see with
shared_preload_libraries.  IIUC the server will still create a placeholder
GUC during startup for custom parameters, which is why it shows up for SHOW
commands.

Thanks for the quick answer !
That's a little surprising at first but I understand better now.

Will there be a facility to check archive_library gucs later on ? It might come in handy with more
guc rich archive modules.


Re: archive modules

From
talk to ben
Date:
(Sorry for the spam Nathan)

With the list in CC and additional information :

The modified archive module parameters are visible in pg_file_settings.
They don't show up in \dconfig+, which I understand given the query used by the
meta command, but I find a little confusing from an end user POV.

Re: archive modules

From
Alvaro Herrera
Date:
On 2022-Jul-07, talk to ben wrote:

> The modified archive module parameters are visible in pg_file_settings.
> They don't show up in \dconfig+, which I understand given the query used by
> the meta command, but I find a little confusing from an end user POV.

Well, this does sound unsatisfactory.  I suppose one answer would be to
load the module in all backends, in case the user wants to look at the
value.  But that would be wasteful.  Maybe we should have a warning
about it in the docs -- tell people to LOAD the library if they want to
examine the configuration?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: archive modules

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Well, this does sound unsatisfactory.  I suppose one answer would be to
> load the module in all backends, in case the user wants to look at the
> value.  But that would be wasteful.  Maybe we should have a warning
> about it in the docs -- tell people to LOAD the library if they want to
> examine the configuration?

The underlying issue here is that the pg_settings view doesn't show
placeholder GUCs because we lack satisfactory values to put in
most of the columns.  I don't know if revisiting that conclusion
would be appropriate or not.  The purist approach would be to show
NULL for any unknown column, but how many applications would that
break?  And even the "known" values could change unexpectedly when
the module does get loaded, for example if the GUC has units and
the value in the config file is expressed in a non-canonical way.
(To say nothing of what a show hook might do...)

            regards, tom lane



Re: archive modules

From
Nathan Bossart
Date:
On Thu, Jul 07, 2022 at 11:48:20AM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Well, this does sound unsatisfactory.  I suppose one answer would be to
>> load the module in all backends, in case the user wants to look at the
>> value.  But that would be wasteful.  Maybe we should have a warning
>> about it in the docs -- tell people to LOAD the library if they want to
>> examine the configuration?

Yeah, for something like basic_archive, it should be fine to load it via
shared_preload_libraries or LOAD as well as archive_library, but not all
archive libraries might be written to handle that correctly.  And this is
not the most user-friendly.

> The underlying issue here is that the pg_settings view doesn't show
> placeholder GUCs because we lack satisfactory values to put in
> most of the columns.  I don't know if revisiting that conclusion
> would be appropriate or not.  The purist approach would be to show
> NULL for any unknown column, but how many applications would that
> break?  And even the "known" values could change unexpectedly when
> the module does get loaded, for example if the GUC has units and
> the value in the config file is expressed in a non-canonical way.
> (To say nothing of what a show hook might do...)

Perhaps the "category" could indicate that this is a placeholder value, and
the pg_settings documentation could explain exactly what that means (i.e.,
unknown to any libraries loaded in the current process, but may have
meaning to others).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
talk to ben
Date:
Would this addition to the documentation be ok ? I hope the english is not too broken ..
Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Tue, Aug 23, 2022 at 04:18:52PM +0200, talk to ben wrote:
> --- a/doc/src/sgml/basic-archive.sgml
> +++ b/doc/src/sgml/basic-archive.sgml
> @@ -68,6 +68,19 @@ basic_archive.archive_directory = '/path/to/archive/directory'
>     to any archiving still in progress, but users should use extra caution when
>     doing so.
>    </para>
> +
> +  <para>
> +   The archive module is loaded by the archiver process. Therefore, the
> +   parameters defined in the module are not set outside this process and cannot
> +   be seen from the <structname>pg_settings</structname> view or the
> +   \dconfig meta-command.
> +   These parameters values can be shown from the server's configuration
> +   file(s) through the <structname>pg_file_settings</structname> view.
> +   If you want to check the actual values applied by the archiver, you can
> +   <command>LOAD</command> the module before reading
> +   <structname>pg_settings</structname>. It's also possible to search
> +   for  the options directly with the <command>SHOW</command> command.
> +  </para>

I don't know if it makes sense to document this in basic_archive.  On one
hand, it seems like folks will commonly encounter this behavior with this
module, so this feels like a natural place for such a note.  But on the
other hand, this is generic behavior for any library that is dynamically
loaded in a separate process.

Overall, I think I'm +1 for this patch.  I haven't thought too much about
the exact wording, but provided others support it as well, I will try to
take a deeper look soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> I don't know if it makes sense to document this in basic_archive.  On one
> hand, it seems like folks will commonly encounter this behavior with this
> module, so this feels like a natural place for such a note.  But on the
> other hand, this is generic behavior for any library that is dynamically
> loaded in a separate process.

Yeah, I don't think this material is at all specific to basic_archive.
Maybe it could be documented near the relevant views, if it isn't already.

Also, I think the proposed text neglects the case of including the
module in shared_preload_libraries.

            regards, tom lane



Re: archive modules

From
talk to ben
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On one hand, it seems like folks will commonly encounter this behavior with this
> module, so this feels like a natural place for such a note. 

Yes, I looked there first.

Would this addition to the pg_settings description be better  ?
Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Wed, Aug 24, 2022 at 10:05:55AM +0200, talk to ben wrote:
>     This view does not display <link linkend="runtime-config-custom">customized options</link>
> -   until the extension module that defines them has been loaded.
> +   until the extension module that defines them has been loaded. Therefore, any
> +   option defined in a library that is dynamically loaded in a separate process
> +   will not be visible in the view, unless the module is manually loaded
> +   beforehand. This case applies for example to an archive module loaded by the
> +   archiver process.

I would suggest something like:

    This view does not display customized options until the extension
    module that defines them has been loaded by the backend process
    executing the query (e.g., via shared_preload_libraries, the LOAD
    command, or a call to a user-defined C function).  For example, since
    the archive_library is only loaded by the archiver process, this view
    will not display any customized options defined by archive modules
    unless special action is taken to load them into the backend process
    executing the query.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
talk to ben
Date:
Here is a patch with the proposed wording.
Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Thu, Aug 25, 2022 at 03:29:41PM +0200, talk to ben wrote:
> Here is a patch with the proposed wording.

Here is the same patch with a couple more links.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Thu, Aug 25, 2022 at 01:06:00PM -0700, Nathan Bossart wrote:
> On Thu, Aug 25, 2022 at 03:29:41PM +0200, talk to ben wrote:
>> Here is a patch with the proposed wording.
> 
> Here is the same patch with a couple more links.

I would advise that you create a commitfest entry for your patch so that it
isn't forgotten.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Benoit Lobréau
Date:

On Tue, Aug 30, 2022 at 12:46 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
I would advise that you create a commitfest entry for your patch so that it
isn't forgotten.

Ok done, https://commitfest.postgresql.org/39/3856/ (is that fine if I added you as a reviewer ?)

and thanks for the added links in the patch.

Re: archive modules

From
Nathan Bossart
Date:
On Tue, Aug 30, 2022 at 09:49:20AM +0200, Benoit Lobréau wrote:
> Ok done, https://commitfest.postgresql.org/39/3856/ (is that fine if I
> added you as a reviewer ?)

Of course.  I've marked it as ready-for-committer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> Of course.  I've marked it as ready-for-committer.

Pushed with a bit of additional wordsmithing.

            regards, tom lane



Re: archive modules

From
Nathan Bossart
Date:
On Sat, Sep 10, 2022 at 04:44:16PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
>> Of course.  I've marked it as ready-for-committer.
> 
> Pushed with a bit of additional wordsmithing.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Peter Eisentraut
Date:
I noticed that this patch has gone around and mostly purged mentions of
archive_command from the documentation and replaced them with
archive_library.  I don't think this is helpful, since people still use
archive_command and will want to see what the documentation has to say
about it.  I suggest we rewind that a bit and for example replace things
like

     removed or recycled until they are archived. If WAL archiving cannot keep up
-   with the pace that WAL is generated, or if <varname>archive_command</varname>
+   with the pace that WAL is generated, or if <varname>archive_library</varname>
     fails repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>

with

     removed or recycled until they are archived. If WAL archiving cannot keep up
     with the pace that WAL is generated, or if <varname>archive_command</varname>
     with the pace that WAL is generated, or if <varname>archive_command</varname>
     or <varname>archive_library</varname>
     fail repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>




Re: archive modules

From
Michael Paquier
Date:
On Wed, Sep 14, 2022 at 06:37:38AM +0200, Peter Eisentraut wrote:
> I noticed that this patch has gone around and mostly purged mentions of
> archive_command from the documentation and replaced them with
> archive_library.  I don't think this is helpful, since people still use
> archive_command and will want to see what the documentation has to say
> about it.  I suggest we rewind that a bit and for example replace things
> like
>
>     removed or recycled until they are archived. If WAL archiving cannot keep up
> -   with the pace that WAL is generated, or if <varname>archive_command</varname>
> +   with the pace that WAL is generated, or if <varname>archive_library</varname>
>     fails repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>
>
> with
>
>     removed or recycled until they are archived. If WAL archiving cannot keep up
>     with the pace that WAL is generated, or if <varname>archive_command</varname>
>     with the pace that WAL is generated, or if <varname>archive_command</varname>
>     or <varname>archive_library</varname>
>     fail repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>

Yep.  Some references to archive_library have been changed by 31e121
to do exactly that.  There seem to be more spots in need of an
update.
--
Michael

Attachment

Re: archive modules

From
Peter Eisentraut
Date:
On 14.09.22 07:25, Michael Paquier wrote:
>>      removed or recycled until they are archived. If WAL archiving cannot keep up
>> -   with the pace that WAL is generated, or if <varname>archive_command</varname>
>> +   with the pace that WAL is generated, or if <varname>archive_library</varname>
>>      fails repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>
>>
>> with
>>
>>      removed or recycled until they are archived. If WAL archiving cannot keep up
>>      with the pace that WAL is generated, or if <varname>archive_command</varname>
>>      with the pace that WAL is generated, or if <varname>archive_command</varname>
>>      or <varname>archive_library</varname>
>>      fail repeatedly, old WAL files will accumulate in <filename>pg_wal</filename>
> 
> Yep.  Some references to archive_library have been changed by 31e121
> to do exactly that.  There seem to be more spots in need of an
> update.

I don't see anything in 31e121 about that.

Here is a patch that addresses this.

Attachment

Re: archive modules

From
Peter Eisentraut
Date:
Another question on this feature: Currently, if archive_library is set, 
archive_command is ignored.  I think if both are set, it should be an 
error.  Compare for example what happens if you set multiple 
recovery_target_xxx settings.  I don't think silently turning off one 
setting by setting another is a good behavior.




Re: archive modules

From
Nathan Bossart
Date:
On Wed, Sep 14, 2022 at 09:33:46PM +0200, Peter Eisentraut wrote:
> Another question on this feature: Currently, if archive_library is set,
> archive_command is ignored.  I think if both are set, it should be an error.
> Compare for example what happens if you set multiple recovery_target_xxx
> settings.  I don't think silently turning off one setting by setting another
> is a good behavior.

I originally did it this way, but changed it based on this feedback [0].  I
have no problem with the general idea, but the recovery_target_* logic does
have the following note:

     * XXX this code is broken by design.  Throwing an error from a GUC assign
     * hook breaks fundamental assumptions of guc.c.  So long as all the variables
     * for which this can happen are PGC_POSTMASTER, the consequences are limited,
     * since we'd just abort postmaster startup anyway.  Nonetheless it's likely
     * that we have odd behaviors such as unexpected GUC ordering dependencies.

[0] https://postgr.es/m/CA%2BTgmoaf4Y7_U%2B_W%2BSg5DoAta_FMssr%3D52mx7-_tJnfaD1VubQ%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Peter Eisentraut
Date:
On 14.09.22 22:03, Nathan Bossart wrote:
> On Wed, Sep 14, 2022 at 09:33:46PM +0200, Peter Eisentraut wrote:
>> Another question on this feature: Currently, if archive_library is set,
>> archive_command is ignored.  I think if both are set, it should be an error.
>> Compare for example what happens if you set multiple recovery_target_xxx
>> settings.  I don't think silently turning off one setting by setting another
>> is a good behavior.
> 
> I originally did it this way, but changed it based on this feedback [0].  I
> have no problem with the general idea, but the recovery_target_* logic does
> have the following note:
> 
>      * XXX this code is broken by design.  Throwing an error from a GUC assign
>      * hook breaks fundamental assumptions of guc.c.  So long as all the variables
>      * for which this can happen are PGC_POSTMASTER, the consequences are limited,
>      * since we'd just abort postmaster startup anyway.  Nonetheless it's likely
>      * that we have odd behaviors such as unexpected GUC ordering dependencies.

Ah yes, that won't work.  But maybe we can just check it at run time, 
like in LoadArchiveLibrary().




Re: archive modules

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 14.09.22 22:03, Nathan Bossart wrote:
>> I originally did it this way, but changed it based on this feedback [0].  I
>> have no problem with the general idea, but the recovery_target_* logic does
>> have the following note:
>>
>> * XXX this code is broken by design.  Throwing an error from a GUC assign
>> * hook breaks fundamental assumptions of guc.c.  So long as all the variables
>> * for which this can happen are PGC_POSTMASTER, the consequences are limited,
>> * since we'd just abort postmaster startup anyway.  Nonetheless it's likely
>> * that we have odd behaviors such as unexpected GUC ordering dependencies.

> Ah yes, that won't work.  But maybe we can just check it at run time,
> like in LoadArchiveLibrary().

Yeah, the objection there is only to trying to enforce such
interrelationships in GUC hooks.  In this case it seems to me that
we could easily check and complain at the point where we're about
to use the GUC values.

            regards, tom lane



Re: archive modules

From
Nathan Bossart
Date:
On Wed, Sep 14, 2022 at 09:31:04PM +0200, Peter Eisentraut wrote:
> Here is a patch that addresses this.

My intent was to present archive_command as the built-in archive library,
but I can see how this might cause confusion, so this change seems
reasonable to me.

> +   <para>
> +    It is important that the archive command return zero exit status if and
> +    only if it succeeds.  Upon getting a zero result,
> +    <productname>PostgreSQL</productname> will assume that the file has been
> +    successfully archived, and will remove or recycle it.  However, a nonzero
> +    status tells <productname>PostgreSQL</productname> that the file was not archived;
> +    it will try again periodically until it succeeds.
> +   </para>
> +
> +   <para>
> +    When the archive command is terminated by a signal (other than
> +    <systemitem>SIGTERM</systemitem> that is used as part of a server
> +    shutdown) or an error by the shell with an exit status greater than
> +    125 (such as command not found), the archiver process aborts and gets
> +    restarted by the postmaster. In such cases, the failure is
> +    not reported in <xref linkend="pg-stat-archiver-view"/>.
> +   </para>

This wording is very similar to the existing wording in the archive library
section below it.  I think the second paragraph covers the shell command case
explicitly, too.  Perhaps these should be combined.

> +        <varname>archive_mode</varname> and <varname>archive_command</varname> are
> +        separate variables so that <varname>archive_command</varname> can be
> +        changed without leaving archiving mode.

I believe this applies to archive_library, too.

> -   for segments to complete like <xref linkend="guc-archive-library"/> does.
> +   for segments to complete like <xref linkend="guc-archive-command"/> and
> +   <xref linkend="guc-archive-library"/> does.

nitpick: s/does/do

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Nathan Bossart
Date:
On Wed, Sep 14, 2022 at 04:47:23PM -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> On 14.09.22 22:03, Nathan Bossart wrote:
>>> I originally did it this way, but changed it based on this feedback [0].  I
>>> have no problem with the general idea, but the recovery_target_* logic does
>>> have the following note:
>>> 
>>> * XXX this code is broken by design.  Throwing an error from a GUC assign
>>> * hook breaks fundamental assumptions of guc.c.  So long as all the variables
>>> * for which this can happen are PGC_POSTMASTER, the consequences are limited,
>>> * since we'd just abort postmaster startup anyway.  Nonetheless it's likely
>>> * that we have odd behaviors such as unexpected GUC ordering dependencies.
> 
>> Ah yes, that won't work.  But maybe we can just check it at run time, 
>> like in LoadArchiveLibrary().
> 
> Yeah, the objection there is only to trying to enforce such
> interrelationships in GUC hooks.  In this case it seems to me that
> we could easily check and complain at the point where we're about
> to use the GUC values.

I think the cleanest way to do something like that would be to load a
check_configured_cb that produces a WARNING.  IIRC failing in
LoadArchiveLibrary() would just cause the archiver process to restart over
and over.  HandlePgArchInterrupts() might need some work as well.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Wed, Sep 14, 2022 at 04:47:23PM -0400, Tom Lane wrote:
>> Yeah, the objection there is only to trying to enforce such
>> interrelationships in GUC hooks.  In this case it seems to me that
>> we could easily check and complain at the point where we're about
>> to use the GUC values.

> I think the cleanest way to do something like that would be to load a
> check_configured_cb that produces a WARNING.  IIRC failing in
> LoadArchiveLibrary() would just cause the archiver process to restart over
> and over.  HandlePgArchInterrupts() might need some work as well.

Hm.  Maybe consistency-check these settings in the postmaster, sometime
after we've absorbed all GUC settings but before we launch any children?
That could provide a saner implementation for the recovery_target_*
variables too.

            regards, tom lane



Re: archive modules

From
Nathan Bossart
Date:
On Wed, Sep 14, 2022 at 06:12:09PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
>> On Wed, Sep 14, 2022 at 04:47:23PM -0400, Tom Lane wrote:
>>> Yeah, the objection there is only to trying to enforce such
>>> interrelationships in GUC hooks.  In this case it seems to me that
>>> we could easily check and complain at the point where we're about
>>> to use the GUC values.
> 
>> I think the cleanest way to do something like that would be to load a
>> check_configured_cb that produces a WARNING.  IIRC failing in
>> LoadArchiveLibrary() would just cause the archiver process to restart over
>> and over.  HandlePgArchInterrupts() might need some work as well.
> 
> Hm.  Maybe consistency-check these settings in the postmaster, sometime
> after we've absorbed all GUC settings but before we launch any children?
> That could provide a saner implementation for the recovery_target_*
> variables too.

Both archive_command and archive_library are PGC_SIGHUP, so IIUC that
wouldn't be sufficient.  I attached a quick sketch that seems to provide
the desired behavior.  It's nowhere near committable yet, but it
demonstrates what I'm thinking.

For recovery_target_*, something like you are describing seems reasonable.
I believe PostmasterMain() already performs some similar checks.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Peter Eisentraut
Date:
On 14.09.22 23:09, Nathan Bossart wrote:
> On Wed, Sep 14, 2022 at 09:31:04PM +0200, Peter Eisentraut wrote:
>> Here is a patch that addresses this.
> 
> My intent was to present archive_command as the built-in archive library,
> but I can see how this might cause confusion, so this change seems
> reasonable to me.

While working on this, I noticed that in master this conflicts with 
commit 3cabe45a819f8a2a282d9d57e45f259c84e97c3f.  I have posted a 
message in that thread looking for a resolution.




Re: archive modules

From
Peter Eisentraut
Date:
On 17.09.22 11:49, Peter Eisentraut wrote:
> On 14.09.22 23:09, Nathan Bossart wrote:
>> On Wed, Sep 14, 2022 at 09:31:04PM +0200, Peter Eisentraut wrote:
>>> Here is a patch that addresses this.
>>
>> My intent was to present archive_command as the built-in archive library,
>> but I can see how this might cause confusion, so this change seems
>> reasonable to me.
> 
> While working on this, I noticed that in master this conflicts with 
> commit 3cabe45a819f8a2a282d9d57e45f259c84e97c3f.  I have posted a 
> message in that thread looking for a resolution.

I have received clarification there, so I went ahead with this patch 
here after some adjustments in master around that other patch.




Re: archive modules

From
Nathan Bossart
Date:
On Wed, Sep 14, 2022 at 09:33:46PM +0200, Peter Eisentraut wrote:
> Another question on this feature: Currently, if archive_library is set,
> archive_command is ignored.  I think if both are set, it should be an error.
> Compare for example what happens if you set multiple recovery_target_xxx
> settings.  I don't think silently turning off one setting by setting another
> is a good behavior.

Peter, would you like to proceed with something like [0] to resolve this?
If so, I will work on cleaning the patch up.

[0] https://postgr.es/m/20220914222736.GA3042279%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Peter Eisentraut
Date:
On 15.09.22 00:27, Nathan Bossart wrote:
> Both archive_command and archive_library are PGC_SIGHUP, so IIUC that
> wouldn't be sufficient.  I attached a quick sketch that seems to provide
> the desired behavior.  It's nowhere near committable yet, but it
> demonstrates what I'm thinking.

What is the effect of issuing a warning like in this patch?  Would it 
just not archive anything until the configuration is fixed?  I'm not 
sure what behavior you are going for; it's a bit hard to imagine from 
just reading the patch.




Re: archive modules

From
Nathan Bossart
Date:
On Fri, Sep 23, 2022 at 05:58:42AM -0400, Peter Eisentraut wrote:
> On 15.09.22 00:27, Nathan Bossart wrote:
>> Both archive_command and archive_library are PGC_SIGHUP, so IIUC that
>> wouldn't be sufficient.  I attached a quick sketch that seems to provide
>> the desired behavior.  It's nowhere near committable yet, but it
>> demonstrates what I'm thinking.
> 
> What is the effect of issuing a warning like in this patch?  Would it just
> not archive anything until the configuration is fixed?  I'm not sure what
> behavior you are going for; it's a bit hard to imagine from just reading the
> patch.

Yes, it will halt archiving and emit a WARNING, just like what happens on
released versions when you leave archive_command empty.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Peter Eisentraut
Date:
On 23.09.22 18:14, Nathan Bossart wrote:
> On Fri, Sep 23, 2022 at 05:58:42AM -0400, Peter Eisentraut wrote:
>> On 15.09.22 00:27, Nathan Bossart wrote:
>>> Both archive_command and archive_library are PGC_SIGHUP, so IIUC that
>>> wouldn't be sufficient.  I attached a quick sketch that seems to provide
>>> the desired behavior.  It's nowhere near committable yet, but it
>>> demonstrates what I'm thinking.
>>
>> What is the effect of issuing a warning like in this patch?  Would it just
>> not archive anything until the configuration is fixed?  I'm not sure what
>> behavior you are going for; it's a bit hard to imagine from just reading the
>> patch.
> 
> Yes, it will halt archiving and emit a WARNING, just like what happens on
> released versions when you leave archive_command empty.

Leaving archive_command empty is an intentional configuration choice.

What we are talking about here is, arguably, a misconfiguration, so it 
should result in an error.




Re: archive modules

From
Nathan Bossart
Date:
On Wed, Oct 05, 2022 at 07:55:58PM +0200, Peter Eisentraut wrote:
> Leaving archive_command empty is an intentional configuration choice.
> 
> What we are talking about here is, arguably, a misconfiguration, so it
> should result in an error.

Okay.  What do you think about something like the attached?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Peter Eisentraut
Date:
On 05.10.22 20:57, Nathan Bossart wrote:
> On Wed, Oct 05, 2022 at 07:55:58PM +0200, Peter Eisentraut wrote:
>> Leaving archive_command empty is an intentional configuration choice.
>>
>> What we are talking about here is, arguably, a misconfiguration, so it
>> should result in an error.
> 
> Okay.  What do you think about something like the attached?

That looks like the right solution to me.

Let's put that into PG 16, and maybe we can consider backpatching it.




Re: archive modules

From
Bharath Rupireddy
Date:
On Mon, Oct 10, 2022 at 1:17 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 05.10.22 20:57, Nathan Bossart wrote:
> >> What we are talking about here is, arguably, a misconfiguration, so it
> >> should result in an error.
> >
> > Okay.  What do you think about something like the attached?

The intent here looks reasonable to me. However, why should the user
be able to set both archive_command and archive_library in the first
place only to later fail in LoadArchiveLibrary() per the patch? IMO,
the check_hook() is the right way to disallow any sorts of GUC
misconfigurations, no?

FWIW, I'm attaching a small patch that uses check_hook().

> That looks like the right solution to me.
>
> Let's put that into PG 16, and maybe we can consider backpatching it.

+1 to backpatch to PG 15 where the archive modules feature was introduced.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Thu, Oct 13, 2022 at 03:25:27PM +0530, Bharath Rupireddy wrote:
> The intent here looks reasonable to me. However, why should the user
> be able to set both archive_command and archive_library in the first
> place only to later fail in LoadArchiveLibrary() per the patch? IMO,
> the check_hook() is the right way to disallow any sorts of GUC
> misconfigurations, no?

There was some discussion upthread about using the GUC hooks to enforce
this [0].  In general, it doesn't seem to be a recommended practice.  One
basic example of the problems with this approach is the following:

  1. Set archive_command and leave archive_library unset and restart
     the server.
  2. Unset archive_command and set archive_library and call 'pg_ctl
     reload'.

After these steps, you'll see the following log messages:

    2022-10-13 10:58:42.112 PDT [1562524] LOG:  received SIGHUP, reloading configuration files
    2022-10-13 10:58:42.114 PDT [1562524] LOG:  cannot set "archive_library" when "archive_command" is specified
    2022-10-13 10:58:42.114 PDT [1562524] DETAIL:  Only one of "archive_library" or "archive_command" can be
specified.
    2022-10-13 10:58:42.114 PDT [1562524] LOG:  parameter "archive_command" changed to ""
    2022-10-13 10:58:42.114 PDT [1562524] LOG:  configuration file "/home/nathan/pgdata/postgresql.conf" contains
errors;unaffected changes were applied
 

[0] https://postgr.es/m/20220914200305.GA2984249%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Thu, Oct 13, 2022 at 03:25:27PM +0530, Bharath Rupireddy wrote:
>> The intent here looks reasonable to me. However, why should the user
>> be able to set both archive_command and archive_library in the first
>> place only to later fail in LoadArchiveLibrary() per the patch? IMO,
>> the check_hook() is the right way to disallow any sorts of GUC
>> misconfigurations, no?

> There was some discussion upthread about using the GUC hooks to enforce
> this [0].  In general, it doesn't seem to be a recommended practice.

Yeah, it really does not work to use GUC hooks to enforce multi-variable
constraints.  We've learned that the hard way (more than once, if memory
serves).

            regards, tom lane



Re: archive modules

From
Michael Paquier
Date:
On Thu, Oct 13, 2022 at 02:53:38PM -0400, Tom Lane wrote:
> Yeah, it really does not work to use GUC hooks to enforce multi-variable
> constraints.  We've learned that the hard way (more than once, if memory
> serves).

414c2fd is one of the most recent ones.  Its thread is about the same
thing.
--
Michael

Attachment

Re: archive modules

From
Bharath Rupireddy
Date:
On Fri, Oct 14, 2022 at 6:00 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Oct 13, 2022 at 02:53:38PM -0400, Tom Lane wrote:
> > Yeah, it really does not work to use GUC hooks to enforce multi-variable
> > constraints.  We've learned that the hard way (more than once, if memory
> > serves).
>
> 414c2fd is one of the most recent ones.  Its thread is about the same
> thing.

Got it. Thanks. Just thinking if we must move below comment somewhere
to guc related files?

 * XXX this code is broken by design.  Throwing an error from a GUC assign
 * hook breaks fundamental assumptions of guc.c.  So long as all the variables
 * for which this can happen are PGC_POSTMASTER, the consequences are limited,
 * since we'd just abort postmaster startup anyway.  Nonetheless it's likely
 * that we have odd behaviors such as unexpected GUC ordering dependencies.
 */

FWIW, I see check_stage_log_stats() and check_log_stats() that set
errdetail and return false causing the similar error:

postgres=# alter system set log_statement_stats = true;
postgres=# select pg_reload_conf();
postgres=# alter system set log_statement_stats = false;
postgres=# alter system set log_parser_stats = true;
ERROR:  invalid value for parameter "log_parser_stats": 1
DETAIL:  Cannot enable parameter when "log_statement_stats" is true.

On Thu, Oct 13, 2022 at 11:54 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Thu, Oct 13, 2022 at 03:25:27PM +0530, Bharath Rupireddy wrote:
> > The intent here looks reasonable to me. However, why should the user
> > be able to set both archive_command and archive_library in the first
> > place only to later fail in LoadArchiveLibrary() per the patch? IMO,
> > the check_hook() is the right way to disallow any sorts of GUC
> > misconfigurations, no?
>
> There was some discussion upthread about using the GUC hooks to enforce
> this [0].  In general, it doesn't seem to be a recommended practice.  One
> basic example of the problems with this approach is the following:
>
>   1. Set archive_command and leave archive_library unset and restart
>      the server.
>   2. Unset archive_command and set archive_library and call 'pg_ctl
>      reload'.

Thanks. And yes, if GUC 'foo' is reset but not reloaded and the
check_hook() in the GUC 'bar' while setting it uses the old value of
'foo' and fails.

I'm re-attaching Nathan's patch as-is from [1] here again, just to
make CF bot test the correct patch. Few comments on that patch:

1)
+    if (XLogArchiveLibrary[0] != '\0' && XLogArchiveCommand[0] != '\0')
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("both archive_command and archive_library specified"),
+                 errdetail("Only one of archive_command,
archive_library may be set.")));

The above errmsg looks informational. Can we just say something like
below?  It doesn't require errdetail as the errmsg says it all. See
the other instances elsewhere [2].

        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("cannot specify both \"archive_command\" and
\"archive_library\"")));

2) I think we have a problem - set archive_mode and archive_library
and start the server, then set archive_command, reload the conf, see
[3] - the archiver needs to error out right? The archiver gets
restarted whenever archive_library changes but not when
archive_command changes. I think the right place for the error is
after or at the end of HandlePgArchInterrupts().

[1] https://www.postgresql.org/message-id/20221005185716.GB201192%40nathanxps13
[2] errmsg("cannot specify both PARSER and COPY options")));
errmsg("cannot specify both %s and %s",
errmsg("cannot specify both %s and %s",
[3]
./psql -c "alter system set archive_mode='on'" postgres
./psql -c "alter system set
archive_library='/home/ubuntu/postgres/contrib/basic_archive/basic_archive.so'"
postgres
./pg_ctl -D data -l logfile restart
./psql -c "alter system set archive_command='cp %p
/home/ubuntu/archived_wal/%f'" postgres
./psql -c "select pg_reload_conf();" postgres
postgres=# show archive_mode;
 archive_mode
--------------
 on
(1 row)
postgres=# show archive_command ;
          archive_command
------------------------------------
 cp %p /home/ubuntu/archived_wal/%f
(1 row)
postgres=# show archive_library ;
                       archive_library
--------------------------------------------------------------
 /home/ubuntu/postgres/contrib/basic_archive/basic_archive.so
(1 row)
postgres=# select pid, wait_event_type, backend_type from
pg_stat_activity where backend_type = 'archiver';
   pid   | wait_event_type | backend_type
---------+-----------------+--------------
 2116760 | Activity        | archiver
(1 row)

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Fri, Oct 14, 2022 at 12:10:18PM +0530, Bharath Rupireddy wrote:
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                 errmsg("both archive_command and archive_library specified"),
> +                 errdetail("Only one of archive_command,
> archive_library may be set.")));
> 
> The above errmsg looks informational. Can we just say something like
> below?  It doesn't require errdetail as the errmsg says it all. See
> the other instances elsewhere [2].
> 
>         ereport(ERROR,
>                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>                  errmsg("cannot specify both \"archive_command\" and
> \"archive_library\"")));

I modeled this after the ERROR that error_multiple_recovery_targets()
emits.  I don't think there's really any material difference between your
proposal and mine, but I don't have a strong opinion.

> 2) I think we have a problem - set archive_mode and archive_library
> and start the server, then set archive_command, reload the conf, see
> [3] - the archiver needs to error out right? The archiver gets
> restarted whenever archive_library changes but not when
> archive_command changes. I think the right place for the error is
> after or at the end of HandlePgArchInterrupts().

Good catch.  You are right, this is broken.  I believe that we need to
check for the misconfiguration in HandlePgArchInterrupts() in addition to
LoadArchiveLibrary().  I will work on fixing this.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Nathan Bossart
Date:
On Fri, Oct 14, 2022 at 11:51:30AM -0700, Nathan Bossart wrote:
> On Fri, Oct 14, 2022 at 12:10:18PM +0530, Bharath Rupireddy wrote:
>> 2) I think we have a problem - set archive_mode and archive_library
>> and start the server, then set archive_command, reload the conf, see
>> [3] - the archiver needs to error out right? The archiver gets
>> restarted whenever archive_library changes but not when
>> archive_command changes. I think the right place for the error is
>> after or at the end of HandlePgArchInterrupts().
> 
> Good catch.  You are right, this is broken.  I believe that we need to
> check for the misconfiguration in HandlePgArchInterrupts() in addition to
> LoadArchiveLibrary().  I will work on fixing this.

As promised...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Bharath Rupireddy
Date:
On Sat, Oct 15, 2022 at 3:13 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 11:51:30AM -0700, Nathan Bossart wrote:
> > On Fri, Oct 14, 2022 at 12:10:18PM +0530, Bharath Rupireddy wrote:
> >> 2) I think we have a problem - set archive_mode and archive_library
> >> and start the server, then set archive_command, reload the conf, see
> >> [3] - the archiver needs to error out right? The archiver gets
> >> restarted whenever archive_library changes but not when
> >> archive_command changes. I think the right place for the error is
> >> after or at the end of HandlePgArchInterrupts().
> >
> > Good catch.  You are right, this is broken.  I believe that we need to
> > check for the misconfiguration in HandlePgArchInterrupts() in addition to
> > LoadArchiveLibrary().  I will work on fixing this.
>
> As promised...

Thanks. I think that if the condition can be simplified something like
in the attached. It's okay to call shutdown callback twice by getting
rid of the comment [1] as it doesn't add any extra value or
information, it just says that we're calling shutdown callback
function. With the attached, the code is more readable and the
footprint of the changes are reduced.

[1]
            /*
             * Call the currently loaded archive module's shutdown callback,
             * if one is defined.
             */
            call_archive_module_shutdown_callback(0, 0);

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Kyotaro Horiguchi
Date:
At Fri, 14 Oct 2022 14:42:56 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in 
> As promised...

As the code written, when archive library is being added while archive
command is already set, archiver first emits seemingly positive
message "restarting archive process because of..", then errors out
after the resatart and keep restarting with complaining for the wrong
setting. I think we don't need the first message.

The ERROR always turns into FATAL, so FATAL would less confusing here,
maybe.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: archive modules

From
Michael Paquier
Date:
On Mon, Oct 17, 2022 at 01:46:39PM +0900, Kyotaro Horiguchi wrote:
> As the code written, when archive library is being added while archive
> command is already set, archiver first emits seemingly positive
> message "restarting archive process because of..", then errors out
> after the resatart and keep restarting with complaining for the wrong
> setting. I think we don't need the first message.
>
> The ERROR always turns into FATAL, so FATAL would less confusing here,
> maybe.

You mean the second message in HandlePgArchInterrupts() when
archiveLibChanged is false?  An ERROR or a FATAL would not change much
as there is a proc_exit() anyway down the road.

+   if (XLogArchiveLibrary[0] != '\0' && XLogArchiveCommand[0] != '\0')
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("both archive_command and archive_library specified"),
+                errdetail("Only one of archive_command, archive_library may be set.")));

So, backpedalling from upthread where Peter mentions that we should
complain if both archive_command and archive_library are set (creating
a parallel with recovery parameters), I'd like to think that pgarch.c
should have zero knowledge of what an archive_command is and should
just handle the library part.  This makes the whole reasoning around
what pgarch.c should be much simpler, aka it just needs to know about
archive *libraries*, not *commands*.  That's the kind of business that
check_configured_cb() is designed for, actually, as far as I
understand, or this callback could just be removed entirely for the
same effect, as there would be no point in having pgarch.c do its
thing without archive_library or archive_command where a WARNING is
issued in the default case (shell_archive with no archive_command).

And, by the way, this patch would prevent the existence of archive
modules that need to be loaded but *want* an archive_command with
what they want to achieve.  That does not strike me as a good idea if
we want to have a maximum of flexibility with this facility.  I think
that for all that, we should put the responsability of what should be
set or not set directly to the modules, aka basic_archive could
complain if archive_command is set, but that does not strike me as a
mandatory requirement, either.  It is true that archive_library has
been introduced as a way to avoid using archive_command, but the point
of creating a stronger dependency between both would be IMO annoying
in the long-term.
--
Michael

Attachment

Re: archive modules

From
Bharath Rupireddy
Date:
On Mon, Oct 17, 2022 at 11:20 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Oct 17, 2022 at 01:46:39PM +0900, Kyotaro Horiguchi wrote:
> > As the code written, when archive library is being added while archive
> > command is already set, archiver first emits seemingly positive
> > message "restarting archive process because of..", then errors out
> > after the resatart and keep restarting with complaining for the wrong
> > setting. I think we don't need the first message.
> >
> > The ERROR always turns into FATAL, so FATAL would less confusing here,
> > maybe.
>
> You mean the second message in HandlePgArchInterrupts() when
> archiveLibChanged is false?  An ERROR or a FATAL would not change much
> as there is a proc_exit() anyway down the road.

Yes, ERROR or FATAL it really doesn't matter, the process exits see,
pg_re_throw(), for archiver PG_exception_stack is null.
2022-10-18 09:57:41.869 UTC [2479104] FATAL:  both archive_command and
archive_library specified
2022-10-18 09:57:41.869 UTC [2479104] DETAIL:  Only one of
archive_command, archive_library may be set.

I think Kyotaro-san's concern is to place errmsg("both archive_command
and archive_library specified"), before errmsg("restarting archiver
process because value of \"archive_library\" was changed", something
like the attached v4 patch.

> +   if (XLogArchiveLibrary[0] != '\0' && XLogArchiveCommand[0] != '\0')
> +       ereport(ERROR,
> +               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                errmsg("both archive_command and archive_library specified"),
> +                errdetail("Only one of archive_command, archive_library may be set.")));
>
> So, backpedalling from upthread where Peter mentions that we should
> complain if both archive_command and archive_library are set (creating
> a parallel with recovery parameters), I'd like to think that pgarch.c
> should have zero knowledge of what an archive_command is and should
> just handle the library part.  This makes the whole reasoning around
> what pgarch.c should be much simpler, aka it just needs to know about
> archive *libraries*, not *commands*.

Are you saying that we make/treat/build shell_archive.c as a separate
shared library/module (instead of just an object file) and load it in
pgarc.c? If yes, this can make pgarch.c simple.

> That's the kind of business that
> check_configured_cb() is designed for, actually, as far as I
> understand, or this callback could just be removed entirely for the
> same effect, as there would be no point in having pgarch.c do its
> thing without archive_library or archive_command where a WARNING is
> issued in the default case (shell_archive with no archive_command).

If it's done as said above, the corresponding check_configured_cb()
can deal with allowing/disallowing/misconfiguring various parameters.

> And, by the way, this patch would prevent the existence of archive
> modules that need to be loaded but *want* an archive_command with
> what they want to achieve.  That does not strike me as a good idea if
> we want to have a maximum of flexibility with this facility.  I think
> that for all that, we should put the responsability of what should be
> set or not set directly to the modules, aka basic_archive could
> complain if archive_command is set, but that does not strike me as a
> mandatory requirement, either.  It is true that archive_library has
> been introduced as a way to avoid using archive_command, but the point
> of creating a stronger dependency between both would be IMO annoying
> in the long-term.

Great thought! If the responsibility of
allowing/disallowing/misconfiguring various parameters is given to
check_configured_cb(), the modules can decide whether to error out or
deal with it or use it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Ian Lawrence Barwick
Date:
2022年10月16日(日) 16:36 Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>:
>
> On Sat, Oct 15, 2022 at 3:13 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 11:51:30AM -0700, Nathan Bossart wrote:
> > > On Fri, Oct 14, 2022 at 12:10:18PM +0530, Bharath Rupireddy wrote:
> > >> 2) I think we have a problem - set archive_mode and archive_library
> > >> and start the server, then set archive_command, reload the conf, see
> > >> [3] - the archiver needs to error out right? The archiver gets
> > >> restarted whenever archive_library changes but not when
> > >> archive_command changes. I think the right place for the error is
> > >> after or at the end of HandlePgArchInterrupts().
> > >
> > > Good catch.  You are right, this is broken.  I believe that we need to
> > > check for the misconfiguration in HandlePgArchInterrupts() in addition to
> > > LoadArchiveLibrary().  I will work on fixing this.
> >
> > As promised...
>
> Thanks. I think that if the condition can be simplified something like
> in the attached. It's okay to call shutdown callback twice by getting
> rid of the comment [1] as it doesn't add any extra value or
> information, it just says that we're calling shutdown callback
> function. With the attached, the code is more readable and the
> footprint of the changes are reduced.
>
> [1]
>             /*
>              * Call the currently loaded archive module's shutdown callback,
>              * if one is defined.
>              */
>             call_archive_module_shutdown_callback(0, 0);

Hi

cfbot reports the patch no longer applies [1].  As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.

[1] http://cfbot.cputube.org/patch_40_3933.log

Thanks

Ian Barwick



Re: archive modules

From
Nathan Bossart
Date:
On Fri, Nov 04, 2022 at 12:05:26PM +0900, Ian Lawrence Barwick wrote:
> cfbot reports the patch no longer applies [1].  As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.

Indeed.  Here is a new version of the patch.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Mon, Oct 17, 2022 at 02:49:51PM +0900, Michael Paquier wrote:
> And, by the way, this patch would prevent the existence of archive
> modules that need to be loaded but *want* an archive_command with
> what they want to achieve.  That does not strike me as a good idea if
> we want to have a maximum of flexibility with this facility.

Such a module could define a custom GUC that accepts a shell command.  I
don't think we should overload the meaning of archive_command based on the
whims of whatever archive module is loaded.  Besides the potential end-user
confusion, your archive_command might be unexpectedly used incorrectly if
you forget to set archive_library.

Perhaps we could eventually move the archive_command functionality to a
contrib module (i.e., "shell_archive") so that users must always set
archive_library.  But until then, I suspect it's better to treat modules
and commands as two separate interfaces to ease migration from older major
versions (even though archive_command is now essentially a built-in archive
module).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Michael Paquier
Date:
On Sat, Nov 05, 2022 at 02:08:58PM -0700, Nathan Bossart wrote:
> Such a module could define a custom GUC that accepts a shell command.  I
> don't think we should overload the meaning of archive_command based on the
> whims of whatever archive module is loaded.  Besides the potential end-user
> confusion, your archive_command might be unexpectedly used incorrectly if
> you forget to set archive_library.

While mostly copying the logic from shell_archive.c to build the
command to execute (aka shell_archive_file), which is not great as
well.  But well, perhaps my whole line of argument is just moot..

> Perhaps we could eventually move the archive_command functionality to a
> contrib module (i.e., "shell_archive") so that users must always set
> archive_library.  But until then, I suspect it's better to treat modules
> and commands as two separate interfaces to ease migration from older major
> versions (even though archive_command is now essentially a built-in archive
> module).

I agree that this is a fine long-term goal, removing all traces of the
archive_command from the backend core code.  This is actually an
argument in favor of having no traces of XLogArchiveCommand in
pgarch.c, no? ;p

I am not sure how long we should wait before being able to do that,
perhaps a couple of years of least?  I'd like to think the sooner the
better (like v17?) but we are usually conservative, and the removal of
the exclusive backup mode took 5~6 years if I recall correctly..
--
Michael

Attachment

Re: archive modules

From
Nathan Bossart
Date:
On Mon, Nov 07, 2022 at 03:20:31PM +0900, Michael Paquier wrote:
> On Sat, Nov 05, 2022 at 02:08:58PM -0700, Nathan Bossart wrote:
>> Perhaps we could eventually move the archive_command functionality to a
>> contrib module (i.e., "shell_archive") so that users must always set
>> archive_library.  But until then, I suspect it's better to treat modules
>> and commands as two separate interfaces to ease migration from older major
>> versions (even though archive_command is now essentially a built-in archive
>> module).
> 
> I agree that this is a fine long-term goal, removing all traces of the
> archive_command from the backend core code.  This is actually an
> argument in favor of having no traces of XLogArchiveCommand in
> pgarch.c, no? ;p

Indeed.

> I am not sure how long we should wait before being able to do that,
> perhaps a couple of years of least?  I'd like to think the sooner the
> better (like v17?) but we are usually conservative, and the removal of
> the exclusive backup mode took 5~6 years if I recall correctly..

Yeah, I imagine we'd need to mark it as deprecated-and-to-be-removed for
several years first.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Peter Eisentraut
Date:
On 05.11.22 21:51, Nathan Bossart wrote:
> On Fri, Nov 04, 2022 at 12:05:26PM +0900, Ian Lawrence Barwick wrote:
>> cfbot reports the patch no longer applies [1].  As CommitFest 2022-11 is
>> currently underway, this would be an excellent time to update the patch.
> 
> Indeed.  Here is a new version of the patch.

I have committed this to master.

The surrounding code has changed a bit between PG15 and master, so if we 
wanted to backpatch this, we'd need another patch from you.  However, at 
this point, I'm content to just leave it be in PG15.




Re: archive modules

From
Nathan Bossart
Date:
On Tue, Nov 15, 2022 at 10:31:44AM +0100, Peter Eisentraut wrote:
> I have committed this to master.

Thanks!

> The surrounding code has changed a bit between PG15 and master, so if we
> wanted to backpatch this, we'd need another patch from you.  However, at
> this point, I'm content to just leave it be in PG15.

Sounds good to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Alvaro Herrera
Date:
On 2022-Nov-15, Nathan Bossart wrote:

> On Tue, Nov 15, 2022 at 10:31:44AM +0100, Peter Eisentraut wrote:

> > The surrounding code has changed a bit between PG15 and master, so if we
> > wanted to backpatch this, we'd need another patch from you.  However, at
> > this point, I'm content to just leave it be in PG15.
> 
> Sounds good to me.

Hmm, really?  It seems to me that we will have two slightly different
behaviors in 15 and master, which may be confusing later on.  I think
it'd be better to make them both work identically.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees."                                      (E. Dijkstra)



Re: archive modules

From
Nathan Bossart
Date:
On Tue, Nov 15, 2022 at 06:14:25PM +0100, Alvaro Herrera wrote:
> On 2022-Nov-15, Nathan Bossart wrote:
> 
>> On Tue, Nov 15, 2022 at 10:31:44AM +0100, Peter Eisentraut wrote:
> 
>> > The surrounding code has changed a bit between PG15 and master, so if we
>> > wanted to backpatch this, we'd need another patch from you.  However, at
>> > this point, I'm content to just leave it be in PG15.
>> 
>> Sounds good to me.
> 
> Hmm, really?  It seems to me that we will have two slightly different
> behaviors in 15 and master, which may be confusing later on.  I think
> it'd be better to make them both work identically.

I don't have a strong opinion either way.  While consistency between v15
and master seems nice, the behavior change might not be appropriate for a
minor release.  BTW I was able to cherry-pick the committed patch to v15
without any changes.  Peter, could you clarify what changes you'd like to
see in a back-patched version?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: archive modules

From
Michael Paquier
Date:
On Tue, Nov 15, 2022 at 12:57:49PM -0800, Nathan Bossart wrote:
> On Tue, Nov 15, 2022 at 06:14:25PM +0100, Alvaro Herrera wrote:
>> Hmm, really?  It seems to me that we will have two slightly different
>> behaviors in 15 and master, which may be confusing later on.  I think
>> it'd be better to make them both work identically.
>
> I don't have a strong opinion either way.  While consistency between v15
> and master seems nice, the behavior change might not be appropriate for a
> minor release.  BTW I was able to cherry-pick the committed patch to v15
> without any changes.  Peter, could you clarify what changes you'd like to
> see in a back-patched version?

FWIW, I am not sure that I would have done d627ce3 as I already
mentioned upthread as the library loading should not be related to
archive_command.  If there is support more support in doing that, I am
fine to withdraw, but the behavior between HEAD and REL_15_STABLE
ought to be consistent.
--
Michael

Attachment