Thread: pg_ctl and port number detection

pg_ctl and port number detection

From
Bruce Momjian
Date:
pg_ctl.c::test_postmaster_connection() has some fragile code that tries
to detect the server port number by looking in the pg_ctl -o string,
postgresql.conf, the PGPORT environment variable, and finally using the
default port number.

I think a simpler solution would be to look in postmaster.pid:
10231/u/pgsql/data  5432001  45481984

pg_ctl already knows the data directory.  If the file is missing, the
server is not running.  If the file exists, the first number on the last
line, divided by 1000, is the port number.  We can then use this port
number for libpq to check for connectivity.

Comments?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> pg_ctl.c::test_postmaster_connection() has some fragile code that tries
> to detect the server port number by looking in the pg_ctl -o string,

It may be fragile, but it works; or at least I've not heard complaints
about it lately.

> I think a simpler solution would be to look in postmaster.pid:
> pg_ctl already knows the data directory.  If the file is missing, the
> server is not running.  If the file exists, the first number on the last
> line, divided by 1000, is the port number.

That's somewhere between fragile and outright wrong.
        regards, tom lane


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > pg_ctl.c::test_postmaster_connection() has some fragile code that tries
> > to detect the server port number by looking in the pg_ctl -o string,
> 
> It may be fragile, but it works; or at least I've not heard complaints
> about it lately.

True.

> > I think a simpler solution would be to look in postmaster.pid:
> > pg_ctl already knows the data directory.  If the file is missing, the
> > server is not running.  If the file exists, the first number on the last
> > line, divided by 1000, is the port number.
> 
> That's somewhere between fragile and outright wrong.

Please explain why my idea is not an improvement.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> pg_ctl already knows the data directory.  If the file is missing, the
>>> server is not running.  If the file exists, the first number on the last
>>> line, divided by 1000, is the port number.

>> That's somewhere between fragile and outright wrong.

> Please explain why my idea is not an improvement.

Because it's assuming that those numbers are sysv shmem keys derived in
a particular way.  We have platforms on which that is wrong, Windows
being the most obvious example.  Reading the shmem key assignment code
closely will suggest to you other ways that this could fail.  Not to
mention that people propose getting rid of sysv shmem approximately
every other month, and perhaps someday that will actually happen;
whereupon whatever might get logged in postmaster.pid could be something
completely different.

If you really think that pulling a port number out of the pid file is an
improvement over what pg_ctl does now, then you need to start by storing
the port number, as such, in the pid file.  Not something that might or
might not be related to the port number.  But what we have to discuss
before that is whether we mind having a significant postmaster version
dependency in pg_ctl.
        regards, tom lane


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> Bruce Momjian <bruce@momjian.us> writes:
> >>> pg_ctl already knows the data directory.  If the file is missing, the
> >>> server is not running.  If the file exists, the first number on the last
> >>> line, divided by 1000, is the port number.
> 
> >> That's somewhere between fragile and outright wrong.
> 
> > Please explain why my idea is not an improvement.
> 
> Because it's assuming that those numbers are sysv shmem keys derived in
> a particular way.  We have platforms on which that is wrong, Windows
> being the most obvious example.  Reading the shmem key assignment code
> closely will suggest to you other ways that this could fail.  Not to
> mention that people propose getting rid of sysv shmem approximately
> every other month, and perhaps someday that will actually happen;
> whereupon whatever might get logged in postmaster.pid could be something
> completely different.

Yeah, I was afraid of Windows.

> If you really think that pulling a port number out of the pid file is an
> improvement over what pg_ctl does now, then you need to start by storing
> the port number, as such, in the pid file.  Not something that might or
> might not be related to the port number.  But what we have to discuss
> before that is whether we mind having a significant postmaster version
> dependency in pg_ctl.

OK, good point on the version issue.  Let's see if we get more
complaints before changing this.  Thanks.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Andrew Dunstan
Date:

On 12/18/2010 06:23 PM, Bruce Momjian wrote:
>
>> If you really think that pulling a port number out of the pid file is an
>> improvement over what pg_ctl does now, then you need to start by storing
>> the port number, as such, in the pid file.  Not something that might or
>> might not be related to the port number.  But what we have to discuss
>> before that is whether we mind having a significant postmaster version
>> dependency in pg_ctl.
> OK, good point on the version issue.  Let's see if we get more
> complaints before changing this.  Thanks.
>

Wasn't there a proposal to provide an explicit port parameter to pg_ctl, 
instead of relying on PGPORT? That would probably be a small advance.

cheers

andrew


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 12/18/2010 06:23 PM, Bruce Momjian wrote:
> >
> >> If you really think that pulling a port number out of the pid file is an
> >> improvement over what pg_ctl does now, then you need to start by storing
> >> the port number, as such, in the pid file.  Not something that might or
> >> might not be related to the port number.  But what we have to discuss
> >> before that is whether we mind having a significant postmaster version
> >> dependency in pg_ctl.
> > OK, good point on the version issue.  Let's see if we get more
> > complaints before changing this.  Thanks.
> >
> 
> Wasn't there a proposal to provide an explicit port parameter to pg_ctl, 
> instead of relying on PGPORT? That would probably be a small advance.

I do not remember that suggestion.

I wonder if we should write the port number as the 4th line in
postmaster.pid and return in a few major releases and use that.  We
could fall back and use our existing code if there is no 4th line.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Florian Pflug
Date:
On Dec19, 2010, at 00:54 , Bruce Momjian wrote:
> I wonder if we should write the port number as the 4th line in
> postmaster.pid and return in a few major releases and use that.  We
> could fall back and use our existing code if there is no 4th line.

What if the postmaster instead created a second unix socket in its
data directory? For security reason, it'd probably need to set
the permissions to 0600, but it'd still allow maintenance tools to
connect reliably if they only knew the data directory.

Don't know if that'd work on windows, though - I have no idea if
we even support something similar to unix sockets there, and if so,
it it lives in the filesystem.

best regards,
Florian Pflug


Re: pg_ctl and port number detection

From
Magnus Hagander
Date:
On Sun, Dec 19, 2010 at 20:16, Florian Pflug <fgp@phlo.org> wrote:
> On Dec19, 2010, at 00:54 , Bruce Momjian wrote:
>> I wonder if we should write the port number as the 4th line in
>> postmaster.pid and return in a few major releases and use that.  We
>> could fall back and use our existing code if there is no 4th line.
>
> What if the postmaster instead created a second unix socket in its
> data directory? For security reason, it'd probably need to set
> the permissions to 0600, but it'd still allow maintenance tools to
> connect reliably if they only knew the data directory.
>
> Don't know if that'd work on windows, though - I have no idea if
> we even support something similar to unix sockets there, and if so,
> it it lives in the filesystem.

We don't, and AFAIK there's nothing that lives in the filesystem. You
have named pipes that live in the namespace, but not within
directories in the filesystem.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: pg_ctl and port number detection

From
Florian Pflug
Date:
On Dec19, 2010, at 21:10 , Magnus Hagander wrote:
> On Sun, Dec 19, 2010 at 20:16, Florian Pflug <fgp@phlo.org> wrote:
>> On Dec19, 2010, at 00:54 , Bruce Momjian wrote:
>>> I wonder if we should write the port number as the 4th line in
>>> postmaster.pid and return in a few major releases and use that.  We
>>> could fall back and use our existing code if there is no 4th line.
>> 
>> What if the postmaster instead created a second unix socket in its
>> data directory? For security reason, it'd probably need to set
>> the permissions to 0600, but it'd still allow maintenance tools to
>> connect reliably if they only knew the data directory.
>> 
>> Don't know if that'd work on windows, though - I have no idea if
>> we even support something similar to unix sockets there, and if so,
>> it it lives in the filesystem.
> 
> We don't, and AFAIK there's nothing that lives in the filesystem. You
> have named pipes that live in the namespace, but not within
> directories in the filesystem.


Hm, OK, that pretty much kills the idea. Having to special-case
windows seems less appealing than the other options.

best regards,
Florian Pflug



Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Andrew Dunstan wrote:
> > 
> > 
> > On 12/18/2010 06:23 PM, Bruce Momjian wrote:
> > >
> > >> If you really think that pulling a port number out of the pid file is an
> > >> improvement over what pg_ctl does now, then you need to start by storing
> > >> the port number, as such, in the pid file.  Not something that might or
> > >> might not be related to the port number.  But what we have to discuss
> > >> before that is whether we mind having a significant postmaster version
> > >> dependency in pg_ctl.
> > > OK, good point on the version issue.  Let's see if we get more
> > > complaints before changing this.  Thanks.
> > >
> > 
> > Wasn't there a proposal to provide an explicit port parameter to pg_ctl, 
> > instead of relying on PGPORT? That would probably be a small advance.
> 
> I do not remember that suggestion.
> 
> I wonder if we should write the port number as the 4th line in
> postmaster.pid and return in a few major releases and use that.  We
> could fall back and use our existing code if there is no 4th line.

OK, here is a modified idea.  For 9.1, we have the postmaster write the
port number as the fourth line in postmaster.pid.  pg_ctl will use that
fourth line if it exists, i.e. the postmaster is 9.1+.

If the fourth line is missing, we use the first number of the third line
on Unix and divide that by 1000 to get the port number.  That file
format is not going to change because it is pre-9.1.  If the third line
is empty (e.g. Windows) we either use PGPORT or throw an error.

So, we have a fine solution for 9.1+ servers, and all Unix servers, and
if you want to use a 9.1+ pg_ctl on a pre-9.1 server on Windows and use
the -w flag and a non-standard port number, you must specify PGPORT. 
Based on the fact that most Windows users use the one-click installer
that will use the matching pg_ctl version, it seems this will work just
fine.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
>> I wonder if we should write the port number as the 4th line in
>> postmaster.pid and return in a few major releases and use that.  We
>> could fall back and use our existing code if there is no 4th line.

No.  If it goes in, it should go in as the third line.  The shmem key
data is private to the server --- we do not want external programs
assuming anything at all about the private part of postmaster.pid.
        regards, tom lane


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> >> I wonder if we should write the port number as the 4th line in
> >> postmaster.pid and return in a few major releases and use that.  We
> >> could fall back and use our existing code if there is no 4th line.
> 
> No.  If it goes in, it should go in as the third line.  The shmem key
> data is private to the server --- we do not want external programs
> assuming anything at all about the private part of postmaster.pid.

OK, so you are suggesting having it as a third value on the third line?
10231/u/pgsql/data  5432001  45481984 port_here                    ^^^^^^^^^
I like that better because it simplifies the test and limits the
possibility of non-atomic multi-line writes.  For Win32, we would just
have the port number because the line is normally empty.                     
--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> No.  If it goes in, it should go in as the third line.  The shmem key
>> data is private to the server --- we do not want external programs
>> assuming anything at all about the private part of postmaster.pid.

> OK, so you are suggesting having it as a third value on the third line?

>     10231
>     /u/pgsql/data
>       5432001  45481984 port_here
>                         ^^^^^^^^^

I'm not sure why you're having such a hard time grasping this concept.
We do not want pg_ctl looking at the shmem key information, not even to
the extent of assuming a particular format for it.  Therefore the port
number has to go before it not after it.  What I'm thinking of is
piddatadirport... here be dragons ...

Actually, if we're going to do this at all, we should do
piddatadirportsocketdir... here be dragons ...

so that pg_ctl doesn't have to assume the server is running with a
default value of unix_socket_dir.  Not sure what to put in the fourth
line on Windows though ... maybe just leave it empty?
        regards, tom lane


Re: pg_ctl and port number detection

From
Andrew Dunstan
Date:

On 12/20/2010 12:41 PM, Tom Lane wrote:
>
> Actually, if we're going to do this at all, we should do
>
>     pid
>     datadir
>     port
>     socketdir
>     ... here be dragons ...
>
> so that pg_ctl doesn't have to assume the server is running with a
> default value of unix_socket_dir.  Not sure what to put in the fourth
> line on Windows though ... maybe just leave it empty?
>
>             

Yes, that seems reasonable.

cheers

andrew


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> No.  If it goes in, it should go in as the third line.  The shmem key
> >> data is private to the server --- we do not want external programs
> >> assuming anything at all about the private part of postmaster.pid.
> 
> > OK, so you are suggesting having it as a third value on the third line?
> 
> >     10231
> >     /u/pgsql/data
> >       5432001  45481984 port_here
> >                         ^^^^^^^^^
> 
> I'm not sure why you're having such a hard time grasping this concept.
> We do not want pg_ctl looking at the shmem key information, not even to
> the extent of assuming a particular format for it.  Therefore the port
> number has to go before it not after it.  What I'm thinking of is
> 
>     pid
>     datadir
>     port
>     ... here be dragons ...
> 
> Actually, if we're going to do this at all, we should do
> 
>     pid
>     datadir
>     port
>     socketdir
>     ... here be dragons ...
> 
> so that pg_ctl doesn't have to assume the server is running with a
> default value of unix_socket_dir.  Not sure what to put in the fourth
> line on Windows though ... maybe just leave it empty?

OK.  I was hesitant to modify the existing postmaster.pid format and was
trying to just add on the end.  It is certainly easier to put it before
the shared memory stuff.  I will work on a patch.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Actually, if we're going to do this at all, we should do
> 
>     pid
>     datadir
>     port
>     socketdir
>     ... here be dragons ...
> 
> so that pg_ctl doesn't have to assume the server is running with a
> default value of unix_socket_dir.  Not sure what to put in the fourth
> line on Windows though ... maybe just leave it empty?

I am curious about the use of the socketdir.  What can that be used for?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Florian Pflug
Date:
On Dec21, 2010, at 03:04 , Bruce Momjian wrote:
> Tom Lane wrote:
>> Actually, if we're going to do this at all, we should do
>> 
>>     pid
>>     datadir
>>     port
>>     socketdir
>>     ... here be dragons ...
>> 
>> so that pg_ctl doesn't have to assume the server is running with a
>> default value of unix_socket_dir.  Not sure what to put in the fourth
>> line on Windows though ... maybe just leave it empty?
> 
> I am curious about the use of the socketdir.  What can that be used for?

If it's non-default and you want to connect via unix sockets, just knowing
the port won't help much.

best regards,
Florian Pflug



Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Florian Pflug wrote:
> On Dec21, 2010, at 03:04 , Bruce Momjian wrote:
> > Tom Lane wrote:
> >> Actually, if we're going to do this at all, we should do
> >> 
> >>     pid
> >>     datadir
> >>     port
> >>     socketdir
> >>     ... here be dragons ...
> >> 
> >> so that pg_ctl doesn't have to assume the server is running with a
> >> default value of unix_socket_dir.  Not sure what to put in the fourth
> >> line on Windows though ... maybe just leave it empty?
> > 
> > I am curious about the use of the socketdir.  What can that be used for?
> 
> If it's non-default and you want to connect via unix sockets, just knowing
> the port won't help much.

Ah, so pg_ctl is going to use that information too --- good point!

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Tom Lane wrote:
> Actually, if we're going to do this at all, we should do
>
>     pid
>     datadir
>     port
>     socketdir
>     ... here be dragons ...
>
> so that pg_ctl doesn't have to assume the server is running with a
> default value of unix_socket_dir.  Not sure what to put in the fourth
> line on Windows though ... maybe just leave it empty?

OK, here is a patch that adds the port number and optionally socket
directory location to postmaster.pid, and modifies pg_ctl to use that
information.  I throw an error on using Win32 with pre-9.1 servers
because we can't get the port number from that file.

This removes some crufty code from pg_ctl and removes dependency on
serveral user-configurable settings that we added as a work-around.

This will allow pg_ctl -w to work more reliabily than it did in the
past.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 2c01e12..c526e8e 100644
*** /tmp/pgrevert.8079/5dqQCd_pg_ctl-ref.sgml    Wed Dec 22 10:10:23 2010
--- doc/src/sgml/ref/pg_ctl-ref.sgml    Wed Dec 22 10:06:33 2010
*************** PostgreSQL documentation
*** 348,368 ****
         <para>
          Wait for the startup or shutdown to complete.
          Waiting is the default option for shutdowns, but not startups.
          When waiting for shutdown, <command>pg_ctl</command> waits for
          the server to remove its <acronym>PID</acronym> file.
!         When waiting for startup, <command>pg_ctl</command> repeatedly
!         attempts to connect to the server via <application>psql</>, and
!         reports success when this is successful.
!         <command>pg_ctl</command> will attempt to use the proper port for
!         <application>psql</>. If the environment variable
!         <envar>PGPORT</envar> exists, that is used.  Otherwise,
!         <command>pg_ctl</command> will see if a port has been set in the
!         <filename>postgresql.conf</filename> file.  If not, it will use the
!         default port that <productname>PostgreSQL</productname> was compiled
!         with (5432 by default).
!         When waiting, <command>pg_ctl</command> will
!         return an exit code based on the success of the startup
!         or shutdown.
         </para>
        </listitem>
       </varlistentry>
--- 348,359 ----
         <para>
          Wait for the startup or shutdown to complete.
          Waiting is the default option for shutdowns, but not startups.
+         When waiting for startup, <command>pg_ctl</command> repeatedly
+         attempts to connect to the server.
          When waiting for shutdown, <command>pg_ctl</command> waits for
          the server to remove its <acronym>PID</acronym> file.
!         <command>pg_ctl</command> returns an exit code based on the
!         success of the startup or shutdown.
         </para>
        </listitem>
       </varlistentry>
*************** PostgreSQL documentation
*** 442,469 ****
      </listitem>
     </varlistentry>

-    <varlistentry>
-     <term><envar>PGHOST</envar></term>
-
-     <listitem>
-      <para>
-       Default host name or Unix-domain socket location for <xref
-       linkend="app-psql"> (used when waiting for startup).
-      </para>
-     </listitem>
-    </varlistentry>
-
-    <varlistentry>
-     <term><envar>PGPORT</envar></term>
-
-     <listitem>
-      <para>
-       Default port number for <xref linkend="app-psql">
-       (used when waiting for startup).
-      </para>
-     </listitem>
-    </varlistentry>
-
    </variablelist>

    <para>
--- 433,438 ----
*************** PostgreSQL documentation
*** 506,523 ****
      </listitem>
     </varlistentry>

-    <varlistentry>
-     <term><filename>postgresql.conf</filename></term>
-
-     <listitem>
-      <para>
-       This file, located in the data directory, is parsed to find the
-       proper port to use with <application>psql</application>
-       when waiting for startup.
-      </para>
-     </listitem>
-    </varlistentry>
-
    </variablelist>
   </refsect1>

--- 475,480 ----
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 14ed914..deb2d58 100644
*** /tmp/pgrevert.8079/PNev2b_miscinit.c    Wed Dec 22 10:10:23 2010
--- src/backend/utils/init/miscinit.c    Wed Dec 22 10:06:33 2010
***************
*** 33,38 ****
--- 33,39 ----
  #include "mb/pg_wchar.h"
  #include "miscadmin.h"
  #include "postmaster/autovacuum.h"
+ #include "postmaster/postmaster.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
*************** CreateLockFile(const char *filename, boo
*** 658,664 ****
                 bool isDDLock, const char *refName)
  {
      int            fd;
!     char        buffer[MAXPGPATH + 100];
      int            ntries;
      int            len;
      int            encoded_pid;
--- 659,665 ----
                 bool isDDLock, const char *refName)
  {
      int            fd;
!     char        buffer[MAXPGPATH * 2 + 256];
      int            ntries;
      int            len;
      int            encoded_pid;
*************** CreateLockFile(const char *filename, boo
*** 868,876 ****
      /*
       * Successfully created the file, now fill it.
       */
!     snprintf(buffer, sizeof(buffer), "%d\n%s\n",
               amPostmaster ? (int) my_pid : -((int) my_pid),
!              DataDir);
      errno = 0;
      if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
      {
--- 869,877 ----
      /*
       * Successfully created the file, now fill it.
       */
!     snprintf(buffer, sizeof(buffer), "%d\n%s\n%d\n%s\n",
               amPostmaster ? (int) my_pid : -((int) my_pid),
!              DataDir, PostPortNumber, UnixSocketDir);
      errno = 0;
      if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
      {
*************** RecordSharedMemoryInLockFile(unsigned lo
*** 994,1001 ****
  {
      int            fd;
      int            len;
      char       *ptr;
!     char        buffer[BLCKSZ];

      fd = open(DIRECTORY_LOCK_FILE, O_RDWR | PG_BINARY, 0);
      if (fd < 0)
--- 995,1003 ----
  {
      int            fd;
      int            len;
+     int            lineno;
      char       *ptr;
!     char        buffer[MAXPGPATH * 2 + 256];

      fd = open(DIRECTORY_LOCK_FILE, O_RDWR | PG_BINARY, 0);
      if (fd < 0)
*************** RecordSharedMemoryInLockFile(unsigned lo
*** 1019,1036 ****
      buffer[len] = '\0';

      /*
!      * Skip over first two lines (PID and path).
       */
!     ptr = strchr(buffer, '\n');
!     if (ptr == NULL ||
!         (ptr = strchr(ptr + 1, '\n')) == NULL)
      {
!         elog(LOG, "bogus data in \"%s\"", DIRECTORY_LOCK_FILE);
!         close(fd);
!         return;
      }
!     ptr++;
!
      /*
       * Append key information.    Format to try to keep it the same length
       * always (trailing junk won't hurt, but might confuse humans).
--- 1021,1040 ----
      buffer[len] = '\0';

      /*
!      * Skip over first four lines (PID, pgdata, portnum, socketdir).
       */
!     ptr = buffer;
!     for (lineno = 1; lineno <= 4; lineno++)
      {
!         if ((ptr = strchr(ptr, '\n')) == NULL)
!         {
!             elog(LOG, "bogus data in \"%s\"", DIRECTORY_LOCK_FILE);
!             close(fd);
!             return;
!         }
!         ptr++;
      }
!
      /*
       * Append key information.    Format to try to keep it the same length
       * always (trailing junk won't hurt, but might confuse humans).
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index c5f855e..b91612a 100644
*** /tmp/pgrevert.8079/9n6jMb_pg_ctl.c    Wed Dec 22 10:10:23 2010
--- src/bin/pg_ctl/pg_ctl.c    Wed Dec 22 10:10:09 2010
*************** static bool postmaster_is_alive(pid_t pi
*** 141,147 ****

  static char postopts_file[MAXPGPATH];
  static char pid_file[MAXPGPATH];
- static char conf_file[MAXPGPATH];
  static char backup_file[MAXPGPATH];
  static char recovery_file[MAXPGPATH];

--- 141,146 ----
*************** start_postmaster(void)
*** 404,516 ****
  static PGPing
  test_postmaster_connection(bool do_checkpoint)
  {
      PGPing        ret = PQPING_OK;    /* assume success for wait == zero */
      int            i;
-     char        portstr[32];
-     char       *p;
-     char       *q;
-     char        connstr[128];    /* Should be way more than enough! */

!     portstr[0] = '\0';
!
!     /*
!      * Look in post_opts for a -p switch.
!      *
!      * This parsing code is not amazingly bright; it could for instance get
!      * fooled if ' -p' occurs within a quoted argument value.  Given that few
!      * people pass complicated settings in post_opts, it's probably good
!      * enough.
!      */
!     for (p = post_opts; *p;)
      {
!         /* advance past whitespace */
!         while (isspace((unsigned char) *p))
!             p++;
!
!         if (strncmp(p, "-p", 2) == 0)
          {
!             p += 2;
!             /* advance past any whitespace/quoting */
!             while (isspace((unsigned char) *p) || *p == '\'' || *p == '"')
!                 p++;
!             /* find end of value (not including any ending quote!) */
!             q = p;
!             while (*q &&
!                    !(isspace((unsigned char) *q) || *q == '\'' || *q == '"'))
!                 q++;
!             /* and save the argument value */
!             strlcpy(portstr, p, Min((q - p) + 1, sizeof(portstr)));
!             /* keep looking, maybe there is another -p */
!             p = q;
!         }
!         /* Advance to next whitespace */
!         while (*p && !isspace((unsigned char) *p))
!             p++;
!     }
!
!     /*
!      * Search config file for a 'port' option.
!      *
!      * This parsing code isn't amazingly bright either, but it should be okay
!      * for valid port settings.
!      */
!     if (!portstr[0])
!     {
!         char      **optlines;

!         optlines = readfile(conf_file);
!         if (optlines != NULL)
!         {
!             for (; *optlines != NULL; optlines++)
!             {
!                 p = *optlines;

!                 while (isspace((unsigned char) *p))
!                     p++;
!                 if (strncmp(p, "port", 4) != 0)
!                     continue;
!                 p += 4;
!                 while (isspace((unsigned char) *p))
!                     p++;
!                 if (*p != '=')
!                     continue;
!                 p++;
!                 /* advance past any whitespace/quoting */
!                 while (isspace((unsigned char) *p) || *p == '\'' || *p == '"')
!                     p++;
!                 /* find end of value (not including any ending quote/comment!) */
!                 q = p;
!                 while (*q &&
!                        !(isspace((unsigned char) *q) ||
!                          *q == '\'' || *q == '"' || *q == '#'))
!                     q++;
!                 /* and save the argument value */
!                 strlcpy(portstr, p, Min((q - p) + 1, sizeof(portstr)));
!                 /* keep looking, maybe there is another */
              }
          }
-     }
-
-     /* Check environment */
-     if (!portstr[0] && getenv("PGPORT") != NULL)
-         strlcpy(portstr, getenv("PGPORT"), sizeof(portstr));

!     /* Else use compiled-in default */
!     if (!portstr[0])
!         snprintf(portstr, sizeof(portstr), "%d", DEF_PGPORT);
!
!     /*
!      * We need to set a connect timeout otherwise on Windows the SCM will
!      * probably timeout first
!      */
!     snprintf(connstr, sizeof(connstr),
!              "dbname=postgres port=%s connect_timeout=5", portstr);

-     for (i = 0; i < wait_seconds; i++)
-     {
-         ret = PQping(connstr);
-         if (ret == PQPING_OK || ret == PQPING_NO_ATTEMPT)
-             break;
          /* No response, or startup still in process; wait */
  #if defined(WIN32)
          if (do_checkpoint)
--- 403,509 ----
  static PGPing
  test_postmaster_connection(bool do_checkpoint)
  {
+     int            portnum = 0;
+     char        socket_dir[MAXPGPATH];
+     char        connstr[MAXPGPATH + 256];
      PGPing        ret = PQPING_OK;    /* assume success for wait == zero */
+     char      **optlines;
      int            i;

!     socket_dir[0] = '\0';
!     connstr[0] = '\0';
!
!     for (i = 0; i < wait_seconds; i++)
      {
!         /* Do we need a connection string? */
!         if (connstr[0] == '\0')
          {
!             /*
!              *     The number of lines in postmaster.pid tells us several things:
!              *
!              *    # of lines
!              *        0    lock file created but status not written
!              *        2    pre-9.1 server, shared memory not created
!              *        3    pre-9.1 server, shared memory created
!              *        4    9.1+ server, shared memory not created
!              *        5    9.1+ server, shared memory created
!              *
!              *    For pre-9.1 Unix servers, we grab the port number from the
!              *    shmem key (first value on line 3).  Pre-9.1 Win32 has no
!              *    written shmem key, so we fail.  9.1+ writes both the port
!              *    number and socket address in the file for us to use.
!              */
!
!             /* Try to read a completed postmaster.pid file */
!             if ((optlines = readfile(pid_file)) != NULL &&
!                 optlines[0] != NULL &&
!                 optlines[1] != NULL &&
!                 optlines[2] != NULL)
!             {
!                 /* A 3-line file? */
!                 if (optlines[3] == NULL)
!                 {
!                     /*
!                      *    Pre-9.1:  on Unix, we get the port number by
!                      *    deriving it from the shmem key (the first number on
!                      *    on the line);  see
!                      *    miscinit.c::RecordSharedMemoryInLockFile().
!                      */
!                     portnum = atoi(optlines[2]) / 1000;
!                     /* Win32 does not give us a shmem key, so we fail. */
!                     if (portnum == 0)
!                     {
!                         write_stderr(_("%s: -w option is not supported on this platform\nwhen connecting to a pre-9.1
server\n"),
!                                      progname);
!                         return PQPING_NO_ATTEMPT;
!                     }
!                 }
!                 else    /* 9.1+ server */
!                 {
!                     portnum = atoi(optlines[2]);
!
!                     /* Get socket directory, if specified. */
!                     if (optlines[3][0] != '\n')
!                     {
!                         /*
!                          *    While unix_socket_directory can accept relative
!                          *    directories, libpq's host must have a leading slash
!                          *    to indicate a socket directory.
!                          */
!                         if (optlines[3][0] != '/')
!                         {
!                             write_stderr(_("%s: -w option cannot use a relative socket directory specification\n"),
!                                          progname);
!                             return PQPING_NO_ATTEMPT;
!                         }
!                         strlcpy(socket_dir, optlines[3], MAXPGPATH);
!                         /* remove newline */
!                         if (strchr(socket_dir, '\n') != NULL)
!                             *strchr(socket_dir, '\n') = '\0';
!                     }
!                 }

!                 /*
!                  * We need to set connect_timeout otherwise on Windows the
!                  * Service Control Manager (SCM) will probably timeout first.
!                  */
!                 snprintf(connstr, sizeof(connstr),
!                          "dbname=postgres port=%d connect_timeout=5", portnum);

!                 if (socket_dir[0] != '\0')
!                     snprintf(connstr + strlen(connstr), sizeof(connstr) - strlen(connstr),
!                         " host='%s'", socket_dir);
              }
          }

!         /* If we have a connection string, ping the server */
!         if (connstr[0] != '\0')
!         {
!             ret = PQping(connstr);
!             if (ret == PQPING_OK || ret == PQPING_NO_ATTEMPT)
!                 break;
!         }

          /* No response, or startup still in process; wait */
  #if defined(WIN32)
          if (do_checkpoint)
*************** main(int argc, char **argv)
*** 2009,2015 ****
      {
          snprintf(postopts_file, MAXPGPATH, "%s/postmaster.opts", pg_data);
          snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", pg_data);
-         snprintf(conf_file, MAXPGPATH, "%s/postgresql.conf", pg_data);
          snprintf(backup_file, MAXPGPATH, "%s/backup_label", pg_data);
          snprintf(recovery_file, MAXPGPATH, "%s/recovery.conf", pg_data);
      }
--- 2002,2007 ----

Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
> > Actually, if we're going to do this at all, we should do
> > 
> >     pid
> >     datadir
> >     port
> >     socketdir
> >     ... here be dragons ...
> > 
> > so that pg_ctl doesn't have to assume the server is running with a
> > default value of unix_socket_dir.  Not sure what to put in the fourth
> > line on Windows though ... maybe just leave it empty?
> 
> OK, here is a patch that adds the port number and optionally socket
> directory location to postmaster.pid, and modifies pg_ctl to use that
> information.  I throw an error on using Win32 with pre-9.1 servers
> because we can't get the port number from that file.
> 
> This removes some crufty code from pg_ctl and removes dependency on
> serveral user-configurable settings that we added as a work-around.
> 
> This will allow pg_ctl -w to work more reliabily than it did in the
> past.

Applied.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: pg_ctl and port number detection

From
Fujii Masao
Date:
On Fri, Dec 24, 2010 at 11:47 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Applied.

storage.sgml seems to need to be updated.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: pg_ctl and port number detection

From
Bruce Momjian
Date:
Fujii Masao wrote:
> On Fri, Dec 24, 2010 at 11:47 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Applied.
>
> storage.sgml seems to need to be updated.

Ah, I see that now, thanks.  Patch attached and applied.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 260df3d..cda7f64 100644
*** /tmp/jT4w0b_storage.sgml    Mon Dec 27 15:19:18 2010
--- doc/src/sgml/storage.sgml    Mon Dec 27 15:18:51 2010
*************** last started with</entry>
*** 116,123 ****

  <row>
   <entry><filename>postmaster.pid</></entry>
!  <entry>A lock file recording the current server PID and shared memory
! segment ID (not present after server shutdown)</entry>
  </row>

  </tbody>
--- 116,124 ----

  <row>
   <entry><filename>postmaster.pid</></entry>
!  <entry>A lock file recording the current postmaster process id (PID),
!  cluster data directory, port number, Unix domain socket directory,
!  and shared memory segment ID</entry>
  </row>

  </tbody>