On 05.03.2020 09:24, Michael Paquier wrote:
> On Wed, Mar 04, 2020 at 08:14:20PM +0300, Alexey Kondratov wrote:
>>> - I did not actually get why you don't check for a missing command
>>> when using wait_result_is_any_signal. In this case I'd think that it
>>> is better to exit immediately as follow-up calls would just fail.
>> Believe it or not, I put 'false' there intentionally. The idea was that
>> if the reason is a signal, then maybe the user got tired of waiting and
>> killed that restore_command process themselves or something like that, so
>> it is better to exit immediately. If it was a missing command, then there
>> is no hurry, so we can go further and complain that the attempt to recover
>> the WAL segment has failed.
>>
>> Actually, I guess that there is no big difference whether we include the
>> missing command here or not. There is no complicated logic afterwards,
>> unlike the real recovery process in Postgres, where we cannot simply
>> return false in that case.
> On the contrary, it seems to me that the difference is very important.
> Imagine for example a frontend tool which calls RestoreArchivedWALFile
> in a loop, and that this one fails because the command called is
> missing. This tool would keep looping for nothing. So checking for a
> missing command and leaving immediately would be more helpful for the
> user. Can you think about scenarios where it would make sense to be
> able to loop in this case instead of failing?
OK, I still had pg_rewind in mind as the only user of this routine. Now
that it is part of the common code, I can imagine a hypothetical tool
that polls the archive and waits for a specific WAL segment to become
available. In that case 'command not found' is definitely the end of the
game, while the absence of a segment is an expected error, so the tool
can keep looping.
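
To make the distinction concrete, here is a rough standalone sketch of
how such a polling tool might classify the exit status of the restore
command. It is not the patch code: the enum and all function names below
are hypothetical, and the check only mimics the effect of passing 'true'
to wait_result_is_any_signal. It assumes the command is run through
system(), so the shell reports 126/127 for a command that cannot be
executed or found.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical outcome codes for a tool polling the archive. */
typedef enum
{
	RESTORE_OK,					/* command succeeded */
	RESTORE_RETRY,				/* command ran but failed, e.g. segment
								 * is not in the archive yet */
	RESTORE_FATAL				/* signal or missing command: stop looping */
} RestoreOutcome;

/*
 * Assumed convention of POSIX shells: 126 = command not executable,
 * 127 = command not found, 128+N = child terminated by signal N.
 */
bool
is_signal_or_missing_command(int status)
{
	if (WIFSIGNALED(status))
		return true;
	return WIFEXITED(status) && WEXITSTATUS(status) > 125;
}

/* Map a raw wait status, as returned by system(), to an outcome. */
RestoreOutcome
classify_restore_status(int status)
{
	if (status == -1)
		return RESTORE_FATAL;	/* system() could not create the child */
	if (is_signal_or_missing_command(status))
		return RESTORE_FATAL;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return RESTORE_OK;
	return RESTORE_RETRY;
}

/* Run the (already expanded) restore command and classify the result. */
RestoreOutcome
run_restore_command(const char *cmd)
{
	return classify_restore_status(system(cmd));
}
```

A caller would sleep and retry on RESTORE_RETRY, and bail out with an
error on RESTORE_FATAL instead of looping for nothing.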
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company