Thread: pg_ctl status with nonexistent data directory
This doesn't seem right: $ pg_ctl -D /nowhere status pg_ctl: no server running It does exit with status 3, so it's not all that broken, but I think the error message could be more accurate.
On Sat, Nov 2, 2013 at 3:32 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > This doesn't seem right: > > $ pg_ctl -D /nowhere status > pg_ctl: no server running > > It does exit with status 3, so it's not all that broken, but I think the > error message could be more accurate. I doubt anyone will object if you feel the urge to fix it. I won't, anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Nov 4, 2013 at 11:31:30AM -0500, Robert Haas wrote: > On Sat, Nov 2, 2013 at 3:32 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > > This doesn't seem right: > > > > $ pg_ctl -D /nowhere status > > pg_ctl: no server running > > > > It does exit with status 3, so it's not all that broken, but I think the > > error message could be more accurate. > > I doubt anyone will object if you feel the urge to fix it. I won't, anyway. I have addressed this issue with the attached patch: $ pg_ctl -D /lkjasdf status pg_ctl: directory "/lkjasdf" does not exist $ pg_ctl -D / status pg_ctl: directory "/" is not a database cluster directory One odd question is that pg_ctl status has this comment for reporting the exit code for non-running servers: printf(_("%s: no server running\n"), progname); /* * The Linux Standard Base Core Specification 3.1 says this should return * '3' * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html */ exit(3); If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Attachment
On Thu, Mar 6, 2014 at 4:38 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Nov 4, 2013 at 11:31:30AM -0500, Robert Haas wrote: >> On Sat, Nov 2, 2013 at 3:32 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> > This doesn't seem right: >> > >> > $ pg_ctl -D /nowhere status >> > pg_ctl: no server running >> > >> > It does exit with status 3, so it's not all that broken, but I think the >> > error message could be more accurate. >> >> I doubt anyone will object if you feel the urge to fix it. I won't, anyway. > > I have addressed this issue with the attached patch: > > $ pg_ctl -D /lkjasdf status > pg_ctl: directory "/lkjasdf" does not exist > $ pg_ctl -D / status > pg_ctl: directory "/" is not a database cluster directory > > One odd question is that pg_ctl status has this comment for reporting > the exit code for non-running servers: > > printf(_("%s: no server running\n"), progname); > > /* > * The Linux Standard Base Core Specification 3.1 says this should return > * '3' > * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html > */ > exit(3); > > If they haven't passed us a data directory, we don't really know if the > server is running or not, so the patch just returns '1'. But for such cases, isn't the status 4 more appropriate? As per above link "4 program or service status is unknown" status 1 - "1 program is dead and /var/run pid file exists" Going by this definition, it seems status 1 means, someone has forcefully killed the server and pid file still remains. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote: > > If they haven't passed us a data directory, we don't really know if the > > server is running or not, so the patch just returns '1'. > > But for such cases, isn't the status 4 more appropriate? > As per above link "4 program or service status is unknown" > > status 1 - "1 program is dead and /var/run pid file exists" > Going by this definition, it seems status 1 means, someone > has forcefully killed the server and pid file still remains. Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. I don't see how specifying non-existant or non-cluster directory would somehow be a case that would be special. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Mar6, 2014, at 00:08 , Bruce Momjian <bruce@momjian.us> wrote: > I have addressed this issue with the attached patch: > > $ pg_ctl -D /lkjasdf status > pg_ctl: directory "/lkjasdf" does not exist > $ pg_ctl -D / status > pg_ctl: directory "/" is not a database cluster directory > > One odd question is that pg_ctl status has this comment for reporting > the exit code for non-running servers: > > printf(_("%s: no server running\n"), progname); > > /* > * The Linux Standard Base Core Specification 3.1 says this should return > * '3' > * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html > */ > exit(3); > > If they haven't passed us a data directory, we don't really know if the > server is running or not, so the patch just returns '1'. Why change the exit code at all in the ENOENT-case? If the directory does not exist, it's fairly certain that the server is not running, so "3" seems fine. Technically, changing the return value is an API change and might break things, so why do it if there's no clear benefit? In the EPERM case (or, rather the non-ENOENT case), I agree with Amit that "4" (meaning "program or service status is unknown") fits much better than "1" (meaning "program is dead and /var/run pid file exists"). So *if* we change it at all, we should change it to 4, not to some other, equally arbitrary value. best regards, Florian Pflug
Bruce Momjian escribió: > Technically, you are right, but I tried a while ago to assign meaningful > values to all the exit locations and the community feedback I got was > that we didn't want that. That sounds odd. Do you have a link? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Bruce Momjian escribi�: >> Technically, you are right, but I tried a while ago to assign meaningful >> values to all the exit locations and the community feedback I got was >> that we didn't want that. > That sounds odd. Do you have a link? FWIW, I recall that in my former life at Red Hat, I had to deal several times with patching pg_ctl to make its exit codes more LSB-compliant. And I think Debian does the same. So Linux packagers, at least, would like to see us pay more attention to that standard. On the other hand, there is no market for changes that make the exit codes "more meaningful" unless the changes can be justified by chapter and verse in LSB. regards, tom lane
On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote: > Bruce Momjian escribió: > > > Technically, you are right, but I tried a while ago to assign meaningful > > values to all the exit locations and the community feedback I got was > > that we didn't want that. > > That sounds odd. Do you have a link? Sure, the patch is here: http://www.postgresql.org/message-id/20130629025033.GI13790@momjian.us and the idea of keeping what we have is stated here: http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Mar 6, 2014 at 7:46 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote: >> > If they haven't passed us a data directory, we don't really know if the >> > server is running or not, so the patch just returns '1'. >> >> But for such cases, isn't the status 4 more appropriate? >> As per above link "4 program or service status is unknown" >> >> status 1 - "1 program is dead and /var/run pid file exists" >> Going by this definition, it seems status 1 means, someone >> has forcefully killed the server and pid file still remains. > > Technically, you are right, but I tried a while ago to assign meaningful > values to all the exit locations and the community feedback I got was > that we didn't want that. I don't see how specifying non-existant or > non-cluster directory would somehow be a case that would be special. One reason could be that as we are already returning special exit code for 'status' option of pg_ctl (exit(3)), this seems to be inline with it as this also get called during status command. Also in the link sent by you in another mail, I could see some statement which indicates it is more important to return correct codes in case of status which sounds a good reason to me. Link and statement http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net "The "status" case is different, because there the exit code can be passed out by the init script directly." If we just want to handle correct exit codes for "status" option, then may be we need to refactor the code a bit to ensure the same. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Bruce Momjian <bruce@momjian.us> writes: > On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote: >> Bruce Momjian escribi�: >>> Technically, you are right, but I tried a while ago to assign meaningful >>> values to all the exit locations and the community feedback I got was >>> that we didn't want that. >> That sounds odd. Do you have a link? > Sure, the patch is here: > http://www.postgresql.org/message-id/20130629025033.GI13790@momjian.us > and the idea of keeping what we have is stated here: > http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net Perhaps I shouldn't be putting words in Peter's mouth, but my reading of his complaint was that he didn't think you'd mapped the pg_ctl failure conditions to LSB status codes very well. That's not necessarily a vote against the abstract idea of making it more LSB-compliant. But it seems like we might have to go through it case-by-case to argue out what's the right error code for each case ... and I'm not sure anybody thinks it's worth that much effort. regards, tom lane
On Thu, Mar 6, 2014 at 10:43:01PM -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote: > >> Bruce Momjian escribi�: > >>> Technically, you are right, but I tried a while ago to assign meaningful > >>> values to all the exit locations and the community feedback I got was > >>> that we didn't want that. > > >> That sounds odd. Do you have a link? > > > Sure, the patch is here: > > http://www.postgresql.org/message-id/20130629025033.GI13790@momjian.us > > and the idea of keeping what we have is stated here: > > http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net > > Perhaps I shouldn't be putting words in Peter's mouth, but my reading of > his complaint was that he didn't think you'd mapped the pg_ctl failure > conditions to LSB status codes very well. That's not necessarily a vote > against the abstract idea of making it more LSB-compliant. > > But it seems like we might have to go through it case-by-case to argue out > what's the right error code for each case ... and I'm not sure anybody > thinks it's worth that much effort. Yes, I think the question was whether the effort was worth it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Fri, Mar 7, 2014 at 09:07:24AM +0530, Amit Kapila wrote: > One reason could be that as we are already returning special exit code > for 'status' option of pg_ctl (exit(3)), this seems to be inline with it as this > also get called during status command. > > Also in the link sent by you in another mail, I could see some statement > which indicates it is more important to return correct codes in case of > status which sounds a good reason to me. > > Link and statement > http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net > > "The "status" case is different, because there the exit code > can be passed out by the init script directly." > > If we just want to handle correct exit codes for "status" option, then > may be we need to refactor the code a bit to ensure the same. OK, done with the attached patch Three is returned for status, but one for other cases. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Attachment
On Fri, Mar 7, 2014 at 9:41 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Fri, Mar 7, 2014 at 09:07:24AM +0530, Amit Kapila wrote: >> One reason could be that as we are already returning special exit code >> for 'status' option of pg_ctl (exit(3)), this seems to be inline with it as this >> also get called during status command. >> >> Also in the link sent by you in another mail, I could see some statement >> which indicates it is more important to return correct codes in case of >> status which sounds a good reason to me. >> >> Link and statement >> http://www.postgresql.org/message-id/51D1E482.5090602@gmx.net >> >> "The "status" case is different, because there the exit code >> can be passed out by the init script directly." >> >> If we just want to handle correct exit codes for "status" option, then >> may be we need to refactor the code a bit to ensure the same. > > OK, done with the attached patch Three is returned for status, but one > for other cases. I think it would have been better if it return status as 4 for cases where directory/file is not accessible (current new cases added by this patch). I think status 3 should be only return only when the data directory is valid and pid file is missing, because in script after getting this code, the next thing they will probably do is to start the server which doesn't seem a good fit for newly added cases. Other than that patch seems fine to me. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Mar 7, 2014 at 07:18:04PM +0530, Amit Kapila wrote: > > OK, done with the attached patch Three is returned for status, but one > > for other cases. > > I think it would have been better if it return status as 4 for cases where > directory/file is not accessible (current new cases added by this patch). > > I think status 3 should be only return only when the data directory is valid > and pid file is missing, because in script after getting this code, the next > thing they will probably do is to start the server which doesn't seem a > good fit for newly added cases. OK, I have updated the attached patch to reflect this, and added a C comment. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Attachment
On Fri, Mar 7, 2014 at 7:59 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Fri, Mar 7, 2014 at 07:18:04PM +0530, Amit Kapila wrote: >> > OK, done with the attached patch Three is returned for status, but one >> > for other cases. >> >> I think it would have been better if it return status as 4 for cases where >> directory/file is not accessible (current new cases added by this patch). >> >> I think status 3 should be only return only when the data directory is valid >> and pid file is missing, because in script after getting this code, the next >> thing they will probably do is to start the server which doesn't seem a >> good fit for newly added cases. > > OK, I have updated the attached patch to reflect this, and added a C > comment. This is fine, do you think we should mention about this in docs? I have added one line in docs (updated patch attached), if you think it makes sense then add it otherwise ignore it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Sat, Mar 8, 2014 at 09:08:31AM +0530, Amit Kapila wrote: > On Fri, Mar 7, 2014 at 7:59 PM, Bruce Momjian <bruce@momjian.us> wrote: > > On Fri, Mar 7, 2014 at 07:18:04PM +0530, Amit Kapila wrote: > >> > OK, done with the attached patch Three is returned for status, but one > >> > for other cases. > >> > >> I think it would have been better if it return status as 4 for cases where > >> directory/file is not accessible (current new cases added by this patch). > >> > >> I think status 3 should be only return only when the data directory is valid > >> and pid file is missing, because in script after getting this code, the next > >> thing they will probably do is to start the server which doesn't seem a > >> good fit for newly added cases. > > > > OK, I have updated the attached patch to reflect this, and added a C > > comment. > > This is fine, do you think we should mention about this in docs? > I have added one line in docs (updated patch attached), if you think > it makes sense then add it otherwise ignore it. I have applied your version of the patch with slightly different text: If an accessible data directory is not specified, the process returns anexit status of 4. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +