restore-command error handling - Mailing list pgsql-general

From Sebastiaan Mannem
Subject restore-command error handling
Date
Msg-id 7f385d14be564e5d8fecf9a3c201c8b5@mannem.nl
List pgsql-general

Hi,


This should probably go to pgsql-hackers, but https://www.postgresql.org/list/ says 'You must try elsewhere first!', and this list seemed the second-best place...


I wanted to point you to this GitHub issue:

https://github.com/wal-g/wal-g/issues/1126


Basically, Postgres only distinguishes 3 classes of return codes from a restore_command:

0: No problem, next WAL file...

1 - 125: End of timeline? OK, let's stop recovery and go online

>=126: Ouch, big problem. Better not proceed, but error out with a FATAL instead
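To make the three classes concrete, here is a minimal Python sketch of the classification as I describe it above. The function name and the returned labels are made up for illustration; this is not Postgres code.

```python
def classify_restore_exit(code: int) -> str:
    """Classify a restore_command exit code as described in this mail."""
    if code == 0:
        return "continue"       # WAL file restored, ask for the next one
    if 1 <= code <= 125:
        return "end_recovery"   # treated as "no more WAL available"
    return "abort"              # >= 126: recovery errors out
```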


Looking at https://tldp.org/LDP/abs/html/exitcodes.html, exit codes beyond 125 are all OS-related.

Like 'Permission problem or command is not an executable', or 'Control-C is fatal error signal 2'.


I would think that an exit code such as 78 would be a better way for a restore_command to signal errors which are not OS-related, but which should still end in 'Ouch, big problem. Better not proceed, but error out with a FATAL instead'.


I think I will work on a fix for wal-g so that its exit codes distinguish these cases better, but since all I can currently do is exit with a code >= 126, I wanted to bring this to the Postgres community too.

Furthermore, this is beyond wal-g, basically for everything that runs as a restore_command...
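As an illustration of the only workaround available today, here is a minimal Python sketch of a wrapper around a restore tool. It assumes the wrapped tool uses exit code 1 for "file not in archive"; that code, the names, and the argv are hypothetical and do not describe wal-g's actual behaviour.

```python
import subprocess

# Assumption: the wrapped tool exits with one of these codes when the
# requested WAL file is simply not in the archive. Purely illustrative.
NOT_FOUND_CODES = frozenset({1})


def map_exit(tool_exit: int) -> int:
    """Translate a tool's exit status into one of the three classes."""
    if tool_exit == 0:
        return 0      # file restored
    if tool_exit in NOT_FOUND_CODES:
        return 1      # 1 - 125: Postgres treats the file as unavailable
    return 126        # anything else: force recovery to error out


def run_restore(argv: list[str]) -> int:
    """Run the real restore tool (argv is hypothetical) and map its status."""
    # subprocess reports death by signal as a negative returncode,
    # which map_exit also turns into 126.
    return map_exit(subprocess.run(argv).returncode)
```

A wrapper script would end with something like sys.exit(run_restore([...])) and be configured as the restore_command. Note that the "real" errors still have to be reported as 126, which is exactly the overlap with OS-level codes I am complaining about.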

Would you consider adding another exit code to the list, so that restore_commands don't need to exit with error codes that were meant to signal OS-level issues?


I wanted to end with this quote from the second link I pointed to:

Ending a script with exit 127 would certainly cause confusion when troubleshooting (is the error code a "command not found" or a user-defined one?).

However, many scripts use an exit 1 as a general bailout-upon-error. 

Since exit code 1 signifies so many possible errors, it is not particularly useful in debugging.

To me, that applies not just to 127, but to all exit codes beyond 125...


Thanks.
