Thread: Bug with pg_ctl -w/wait and config-only directories

Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
In researching pg_ctl -w/wait mode for pg_upgrade, I found that pg_ctl
-w's handling of configuration-only directories is often incorrect.  For
example, 'pg_ctl -w stop' checks for the postmaster.pid file to
determine when the server is shut down, but there is no postmaster.pid
file in the config directory, so it fails, i.e. does nothing.  What is
interesting is that specifying the real data directory does work.  

Similarly, pg_ctl references these data directory files:
       snprintf(postopts_file, MAXPGPATH, "%s/postmaster.opts", pg_data);       snprintf(backup_file, MAXPGPATH,
"%s/backup_label",pg_data);       snprintf(recovery_file, MAXPGPATH, "%s/recovery.conf", pg_data);
snprintf(promote_file,MAXPGPATH, "%s/promote", pg_data);
 

I assume things that use these files also don't work for config-only
directories.  

You might think that you can always just specify the real data
directory, but that doesn't work if the server has to be started because
you need to point to postgresql.conf.  pg_ctl -w restart is a classic
case of something that needs both the config directory and the real data
directory.  Basically, this stuff all seems broken and needs to be fixed
or documented.

What is even worse is that pre-9.1, pg_ctl start would read ports from
the pg_ctl -o command line, but in 9.1 we changed this to force reading
the postmaster.pid file to find the port number and socket directory
location --- meaning, new in PG 9.1, 'pg_ctl -w start' doesn't work for
config-only directories either.  And, we can't easily connect to the
server to get the 'data_directory' because we need to read
postmaster.pid to get the connection settings.  :-(

I think this points to the need for a command-line tool to output the
data directory location;  I am not sure what to do about the new 9.1
breakage.

pg_upgrade can work around these issues by starting using the config
directory and stopping using the real data directory, but it cannot work
around the 9.1 pg_ctl -w start problem for config-only directories.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Sat, Oct 01, 2011 at 02:08:33PM -0400, Bruce Momjian wrote:
> In researching pg_ctl -w/wait mode for pg_upgrade, I found that pg_ctl
> -w's handling of configuration-only directories is often incorrect.  For
> example, 'pg_ctl -w stop' checks for the postmaster.pid file to
> determine when the server is shut down, but there is no postmaster.pid
> file in the config directory, so it fails, i.e. does nothing.  What is
> interesting is that specifying the real data directory does work.
>
> Similarly, pg_ctl references these data directory files:
>
>         snprintf(postopts_file, MAXPGPATH, "%s/postmaster.opts", pg_data);
>         snprintf(backup_file, MAXPGPATH, "%s/backup_label", pg_data);
>         snprintf(recovery_file, MAXPGPATH, "%s/recovery.conf", pg_data);
>         snprintf(promote_file, MAXPGPATH, "%s/promote", pg_data);
>
> I assume things that use these files also don't work for config-only
> directories.
>
> You might think that you can always just specify the real data
> directory, but that doesn't work if the server has to be started because
> you need to point to postgresql.conf.  pg_ctl -w restart is a classic
> case of something that needs both the config directory and the real data
> directory.  Basically, this stuff all seems broken and needs to be fixed
> or documented.
>
> What is even worse is that pre-9.1, pg_ctl start would read ports from
> the pg_ctl -o command line, but in 9.1 we changed this to force reading
> the postmaster.pid file to find the port number and socket directory
> location --- meaning, new in PG 9.1, 'pg_ctl -w start' doesn't work for
> config-only directories either.  And, we can't easily connect to the
> server to get the 'data_directory' because we need to read
> postmaster.pid to get the connection settings.  :-(
>
> I think this points to the need for a command-line tool to output the
> data directory location;  I am not sure what to do about the new 9.1
> breakage.
>
> pg_upgrade can work around these issues by starting using the config
> directory and stopping using the real data directory, but it cannot work
> around the 9.1 pg_ctl -w start problem for config-only directories.
>
> --
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
>
>   + It's impossible for everything to be true. +

I went through several iterations trying to find a command that can work
the way we'd like it to. (Essentially is works the way you're describing
it should.) So, in Gentoo, for the initscript, we have this really ugly
command to start the server:
   su -l postgres \       -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \           /usr/lib/postgresql-9.0/bin/pg_ctl \
         start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \           -o '-D ${PGDATA}
--data-directory=${DATA_DIR}\               --silent-mode=true ${PGOPTS}'" 

And to stop the server:
   su -l postgres \       -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \           /usr/lib/postgresql-9.0/bin/pg_ctl \
         stop ${WAIT_FOR_STOP} -t ${NICE_TIMEOUT} -s -D ${DATA_DIR} \           -m smart" 

The default values for these are:
   PGPORT='5432'   PG_EXTRA_ENV=''   WAIT_FOR_START='-w'   START_TIMEOUT='60'   WAIT_FOR_STOP='-w'   NICE_TIMEOUT='60'
DATA_DIR='/var/lib/postgresql/9.0/data'   PGDATA='/etc/postgresql-9.0'   PGOPTS='' 

We don't use 'pg_ctl restart', instead we stop and then start the
server. So, I don't have an answer for that. I'd imagine passing '-D
${DATA_DIR}' would do the trick there as well.

Of course, simplifying this a bit would be welcome.

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Mr. Aaron W. Swenson wrote:
> I went through several iterations trying to find a command that can work
> the way we'd like it to. (Essentially is works the way you're describing
> it should.) So, in Gentoo, for the initscript, we have this really ugly
> command to start the server:
> 
>     su -l postgres \
>         -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \
>             /usr/lib/postgresql-9.0/bin/pg_ctl \
>             start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \
>             -o '-D ${PGDATA} --data-directory=${DATA_DIR} \
>                 --silent-mode=true ${PGOPTS}'"
> 
> And to stop the server:
> 
>     su -l postgres \
>         -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \
>             /usr/lib/postgresql-9.0/bin/pg_ctl \
>             stop ${WAIT_FOR_STOP} -t ${NICE_TIMEOUT} -s -D ${DATA_DIR} \
>             -m smart"
> 
> The default values for these are:
> 
>     PGPORT='5432'
>     PG_EXTRA_ENV=''
>     WAIT_FOR_START='-w'
>     START_TIMEOUT='60'
>     WAIT_FOR_STOP='-w'
>     NICE_TIMEOUT='60'
>     DATA_DIR='/var/lib/postgresql/9.0/data'
>     PGDATA='/etc/postgresql-9.0'
>     PGOPTS=''
> 
> We don't use 'pg_ctl restart', instead we stop and then start the
> server. So, I don't have an answer for that. I'd imagine passing '-D
> ${DATA_DIR}' would do the trick there as well.
> 
> Of course, simplifying this a bit would be welcome. 

What exactly is your question?  You are not using a config-only
directory but the real data directory, so it should work fine.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Fujii Masao
Date:
On Sun, Oct 2, 2011 at 7:54 AM, Bruce Momjian <bruce@momjian.us> wrote:
> What exactly is your question?  You are not using a config-only
> directory but the real data directory, so it should work fine.

No. He is using PGDATA (= /etc/postgresql-9.0) as a config-only
directory, and DATA_DIR (= /var/lib/postgresql/9.0/data) as a
real data directory.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Fujii Masao wrote:
> On Sun, Oct 2, 2011 at 7:54 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > What exactly is your question? ?You are not using a config-only
> > directory but the real data directory, so it should work fine.
> 
> No. He is using PGDATA (= /etc/postgresql-9.0) as a config-only
> directory, and DATA_DIR (= /var/lib/postgresql/9.0/data) as a
> real data directory.

Wow, I see what you mean now!  So the user already figured out it was
broken and used the workaround I recently discovered?  Was this ever
reported to the community?  If so, I never saw it.

So, in testing, I see it is even more broken than I thought.  Not only
is pg_ctl -w broken for start/stop for config-only installs, but pg_ctl
stop (no -w) is also broken because it can't find the postmaster.pid
file to check or use to get the pid to send the signal.  pg_ctl reload
and restart are similarly broken.  :-(

And it gets worse.  The example supplied by the Gentoo developer shows a
use case where the data directory is not even specified in the
configuration file but rather on the command line:
   su -l postgres \       -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \           /usr/lib/postgresql-9.0/bin/pg_ctl \
         start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \           -o '-D ${PGDATA}
--data-directory=${DATA_DIR}\               --silent-mode=true ${PGOPTS}'"
 

In this case, dumping the postgresql.conf file settings is not going to
help --- there is nothing in the config directory that is going to point
us to the data directory --- it exists only in the process arguments.

Frankly, I am confused how this breakage has gone unreported for so long.

Our current TODO item is:
Allow pg_ctl to work properly with configuration files located outsidethe PGDATA directory    pg_ctl can not read the
pidfile because it isn't located in theconfig directory but in the PGDATA directory. The solution is to allowpg_ctl to
readand understand postgresql.conf to find the data_directoryvalue.           BUG #5103: "pg_ctl -w (re)start" fails
withcustomunix_socket_directory 
 

While this is accurate, it certainly is missing much of the breakage. 
Finding a non-standard socket directory is the least of our problems
with config-only directories (even standard settings don't work), and
reading the config file is not enough of a solution because of the
possible passing of parameters on the command line.

To add even more complexity, imagine someone using the same config
directory for several data/cluster directories, and just passing a
unique --data-directory for each one on start --- in that case,
specifying the config directory is not sufficiently unique to specify
which data directory.  It seems we would need some way to pass the data
directory to pg_ctl, perhaps via -o, but parsing that was something we
have tried to avoid (there may be no other choice), and it would have to
be supplied for start and stop.

The only conclusion I can come up with is that we need to be able to
dump postgresql.conf's data_directory, but also to read it from the
command line.

I am starting to question the value of config-only directories if pg_ctl
stop doesn't work, or you have to specify a different directory for
start and stop.  Writing a second postmaster.pid file into the config
directory would help, but it would break with shared-config setups and I
don't think we can assume we have write permission on the config
directory.

What are config-only directories buying us that we can't get from
telling users to use symlinks and point to the data directory directly?

Did we not think of these things when we designed config-only
directories?  I don't even see this problem mentioned in our
documentation.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I am starting to question the value of config-only directories if pg_ctl
> stop doesn't work, or you have to specify a different directory for
> start and stop.

Yup.

> Did we not think of these things when we designed config-only
> directories?  I don't even see this problem mentioned in our
> documentation.

Yeah, we did.  The people who were lobbying for the feature didn't care,
or possibly thought that somebody would fix it for them later.
        regards, tom lane


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of lun oct 03 12:34:22 -0300 2011:
> Bruce Momjian <bruce@momjian.us> writes:
> > I am starting to question the value of config-only directories if pg_ctl
> > stop doesn't work, or you have to specify a different directory for
> > start and stop.
> 
> Yup.
> 
> > Did we not think of these things when we designed config-only
> > directories?  I don't even see this problem mentioned in our
> > documentation.
> 
> Yeah, we did.  The people who were lobbying for the feature didn't care,
> or possibly thought that somebody would fix it for them later.

I think the main proponents are the Debian guys, and they don't use
pg_ctl because they have their own pg_ctlcluster.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Tom Lane's message of lun oct 03 12:34:22 -0300 2011:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > I am starting to question the value of config-only directories if pg_ctl
> > > stop doesn't work, or you have to specify a different directory for
> > > start and stop.
> > 
> > Yup.
> > 
> > > Did we not think of these things when we designed config-only
> > > directories?  I don't even see this problem mentioned in our
> > > documentation.
> > 
> > Yeah, we did.  The people who were lobbying for the feature didn't care,
> > or possibly thought that somebody would fix it for them later.
> 
> I think the main proponents are the Debian guys, and they don't use
> pg_ctl because they have their own pg_ctlcluster.

OK, so it is as messed up as I thought.

I am all fine for people lobbying for features, but not if they don't
work with our tools.  pg_upgrade is certainly not going to use the
Debian start/stop tools unless Debian patches pg_upgrade.

So someone thought we would eventually fix the tools?  I am unclear
exactly how to fix much of this.  Even documenting some workarounds
seems impossible, e.g. pg_ctl restart.

I can't see any feature config-only directories adds that can't be
accomplished by symlinks.  Even the ability to use a single
configuration file for multiple clusters can be done.

In summary, here is what I have found that works or is impossible with
config-only directories:
pg_ctl start        specify config directorypg_ctl -w start        impossiblepg_ctl restart        impossiblepg_ctl
stop       specify real data dirpg_ctl -w stop        specify real data dirpg_ctl reload        specify real data dir
 

Config-only directories seem to be only adding confusion.  All possible
solutions seem to be adding more code and user requirements, which the
creation of symlinks avoids.

Is it time for me to ask on 'general' if removal of this feature is
warranted?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Config-only directories seem to be only adding confusion.  All possible
> solutions seem to be adding more code and user requirements, which the
> creation of symlinks avoids.

> Is it time for me to ask on 'general' if removal of this feature is
> warranted?

Well, the way we could fix it is to invent the parse-the-config-files
option that was alluded to recently.  Then pg_ctl would continue to
take the -D switch or PGDATA environment variable with the same meaning
that the postmaster attaches to it, and would do something like
postgres --print-config-value=data_directory -D $PGDATA

to extract the actual location of the data directory.

Whether this is worth the trouble is highly debatable IMO.  One obvious
risk factor for pg_ctl stop/restart is that the current contents of
postgresql.conf might not match what they were when the postmaster was
started.

I was never exactly thrilled with the separate-config-directory design
to start with, so I'm probably not the person to opine on whether we
could get away with removing it.
        regards, tom lane


Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
> Frankly, I am confused how this breakage has gone unreported for so
> long.

Well, nobody is required to use pg_ctl, and for the longest time, it was
pg_ctl that was considered to be broken (for various other reasons) and
avoided in packaged init scripts.

Arguably, if push came to shove, pg_upgrade wouldn't really need to use
pg_ctl either.



Re: Bug with pg_ctl -w/wait and config-only directories

From
Dave Page
Date:
On Mon, Oct 3, 2011 at 7:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On mån, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
>> Frankly, I am confused how this breakage has gone unreported for so
>> long.
>
> Well, nobody is required to use pg_ctl,

You are if you wish to run as a service on Windows.


--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Andrew Dunstan
Date:

On 10/03/2011 12:54 PM, Tom Lane wrote:
>
> I was never exactly thrilled with the separate-config-directory design
> to start with, so I'm probably not the person to opine on whether we
> could get away with removing it.
>
>             

The horse has well and truly bolted. We'd have a major row if anyone 
tried to remove it. Let's not rehash old battles. Our only option is to 
make it work as best we can.

cheers

andrew


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 10/03/2011 12:54 PM, Tom Lane wrote:
> >
> > I was never exactly thrilled with the separate-config-directory design
> > to start with, so I'm probably not the person to opine on whether we
> > could get away with removing it.
> >
> >             
> 
> The horse has well and truly bolted. We'd have a major row if anyone 
> tried to remove it. Let's not rehash old battles. Our only option is to 
> make it work as best we can.

I disagree.  If people were using it we would have had many more bug
reports about pg_ctl not working.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Config-only directories seem to be only adding confusion.  All possible
> > solutions seem to be adding more code and user requirements, which the
> > creation of symlinks avoids.
> 
> > Is it time for me to ask on 'general' if removal of this feature is
> > warranted?
> 
> Well, the way we could fix it is to invent the parse-the-config-files
> option that was alluded to recently.  Then pg_ctl would continue to
> take the -D switch or PGDATA environment variable with the same meaning
> that the postmaster attaches to it, and would do something like
> 
>     postgres --print-config-value=data_directory -D $PGDATA
> 
> to extract the actual location of the data directory.

That works, assuming the server was not started with -o
'data_directory=/abc'.  The only workaround there would be to have
pg_ctl supply the -o, even on pg_ctl stop, and parse that in pg_ctl.

> Whether this is worth the trouble is highly debatable IMO.  One obvious
> risk factor for pg_ctl stop/restart is that the current contents of
> postgresql.conf might not match what they were when the postmaster was
> started.
> 
> I was never exactly thrilled with the separate-config-directory design
> to start with, so I'm probably not the person to opine on whether we
> could get away with removing it.

The entire thing seems logically broken, to the point where even if we
did get code working, few users would even understand it.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Andrew Dunstan
Date:

On 10/03/2011 02:15 PM, Bruce Momjian wrote:
> Andrew Dunstan wrote:
>>
>> On 10/03/2011 12:54 PM, Tom Lane wrote:
>>> I was never exactly thrilled with the separate-config-directory design
>>> to start with, so I'm probably not the person to opine on whether we
>>> could get away with removing it.
>>>
>>>             
>> The horse has well and truly bolted. We'd have a major row if anyone
>> tried to remove it. Let's not rehash old battles. Our only option is to
>> make it work as best we can.
> I disagree.  If people were using it we would have had many more bug
> reports about pg_ctl not working.
>

No, that's an indication people aren't using pg_ctl, not that they 
aren't using separate config dirs.

cheers

andrew


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> On m?n, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
> > Frankly, I am confused how this breakage has gone unreported for so
> > long.
> 
> Well, nobody is required to use pg_ctl, and for the longest time, it was
> pg_ctl that was considered to be broken (for various other reasons) and
> avoided in packaged init scripts.

Yes, but I am now seeing that pg_ctl is really unfixable.  Is the
config-only directory really a valuable feature if pg_ctl does not work?

If we could document that pg_ctl (and pg_upgrade) doesn't work with
config-only directories, at least we would have a consistent API.  The
question is whether the config-only directory is useful with this
restriction.  Are people recording the postmaster pid somewhere when
they start it?  I doubt they are parsing the connection information we
added to postmaster.pid in 9.1.  Are they manually going into the
postmaster.pdi file and grabbing the first line?

> Arguably, if push came to shove, pg_upgrade wouldn't really need to use
> pg_ctl either.

It would have to implement the 'wait' mode inside pg_upgrade, and in
other applications that needs that behavior.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Dave Page
Date:
On Mon, Oct 3, 2011 at 7:15 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Andrew Dunstan wrote:
>>
>>
>> On 10/03/2011 12:54 PM, Tom Lane wrote:
>> >
>> > I was never exactly thrilled with the separate-config-directory design
>> > to start with, so I'm probably not the person to opine on whether we
>> > could get away with removing it.
>> >
>> >
>>
>> The horse has well and truly bolted. We'd have a major row if anyone
>> tried to remove it. Let's not rehash old battles. Our only option is to
>> make it work as best we can.
>
> I disagree.  If people were using it we would have had many more bug
> reports about pg_ctl not working.

Debian/ubuntu packages and our own project infrastructure use it.
Though, there is a non-trivial script wrapping it, presumably to try
to make it work properly, and handle side-by-side installations of
different major versions.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 10/03/2011 02:15 PM, Bruce Momjian wrote:
> > Andrew Dunstan wrote:
> >>
> >> On 10/03/2011 12:54 PM, Tom Lane wrote:
> >>> I was never exactly thrilled with the separate-config-directory design
> >>> to start with, so I'm probably not the person to opine on whether we
> >>> could get away with removing it.
> >>>
> >>>             
> >> The horse has well and truly bolted. We'd have a major row if anyone
> >> tried to remove it. Let's not rehash old battles. Our only option is to
> >> make it work as best we can.
> > I disagree.  If people were using it we would have had many more bug
> > reports about pg_ctl not working.
> >
> 
> No, that's an indication people aren't using pg_ctl, not that they 
> aren't using separate config dirs.

So, you are saying that people who want config-only directories are just
not people who normally use pg_ctl, because if they were, they would
have reported the bug?  That seems unlikely.  I will admit the Gentoo
case is exactly that.

So we just document that config-only directories don't work for pg_ctl
and pg_upgrade?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Andrew Dunstan
Date:

On 10/03/2011 02:25 PM, Bruce Momjian wrote:
> Andrew Dunstan wrote:
>>
>> On 10/03/2011 02:15 PM, Bruce Momjian wrote:
>>> Andrew Dunstan wrote:
>>>> On 10/03/2011 12:54 PM, Tom Lane wrote:
>>>>> I was never exactly thrilled with the separate-config-directory design
>>>>> to start with, so I'm probably not the person to opine on whether we
>>>>> could get away with removing it.
>>>>>
>>>>>             
>>>> The horse has well and truly bolted. We'd have a major row if anyone
>>>> tried to remove it. Let's not rehash old battles. Our only option is to
>>>> make it work as best we can.
>>> I disagree.  If people were using it we would have had many more bug
>>> reports about pg_ctl not working.
>>>
>> No, that's an indication people aren't using pg_ctl, not that they
>> aren't using separate config dirs.
> So, you are saying that people who want config-only directories are just
> not people who normally use pg_ctl, because if they were, they would
> have reported the bug?  That seems unlikely.  I will admit the Gentoo
> case is exactly that.

As Dave has pointed out there are many more people that use it, probably 
most notably Debian/Ubuntu users.

> So we just document that config-only directories don't work for pg_ctl
> and pg_upgrade?
>

I'd rather not if it can be avoided.

cheers

andrew




Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 10/03/2011 02:25 PM, Bruce Momjian wrote:
> > Andrew Dunstan wrote:
> >>
> >> On 10/03/2011 02:15 PM, Bruce Momjian wrote:
> >>> Andrew Dunstan wrote:
> >>>> On 10/03/2011 12:54 PM, Tom Lane wrote:
> >>>>> I was never exactly thrilled with the separate-config-directory design
> >>>>> to start with, so I'm probably not the person to opine on whether we
> >>>>> could get away with removing it.
> >>>>>
> >>>>>             
> >>>> The horse has well and truly bolted. We'd have a major row if anyone
> >>>> tried to remove it. Let's not rehash old battles. Our only option is to
> >>>> make it work as best we can.
> >>> I disagree.  If people were using it we would have had many more bug
> >>> reports about pg_ctl not working.
> >>>
> >> No, that's an indication people aren't using pg_ctl, not that they
> >> aren't using separate config dirs.
> > So, you are saying that people who want config-only directories are just
> > not people who normally use pg_ctl, because if they were, they would
> > have reported the bug?  That seems unlikely.  I will admit the Gentoo
> > case is exactly that.
> 
> As Dave has pointed out there are many more people that use it, probably 
> most notably Debian/Ubuntu users.
> 
> > So we just document that config-only directories don't work for pg_ctl
> > and pg_upgrade?
> >
> 
> I'd rather not if it can be avoided.

OK, please propose and "avoid" plan?  I can't come up with one that makes
any sense.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Magnus Hagander
Date:
On Mon, Oct 3, 2011 at 20:39, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 10/03/2011 02:25 PM, Bruce Momjian wrote:
>>
>> Andrew Dunstan wrote:
>>>
>>> On 10/03/2011 02:15 PM, Bruce Momjian wrote:
>>>>
>>>> Andrew Dunstan wrote:
>>>>>
>>>>> On 10/03/2011 12:54 PM, Tom Lane wrote:
>>>>>>
>>>>>> I was never exactly thrilled with the separate-config-directory design
>>>>>> to start with, so I'm probably not the person to opine on whether we
>>>>>> could get away with removing it.
>>>>>>
>>>>>>
>>>>>
>>>>> The horse has well and truly bolted. We'd have a major row if anyone
>>>>> tried to remove it. Let's not rehash old battles. Our only option is to
>>>>> make it work as best we can.
>>>>
>>>> I disagree.  If people were using it we would have had many more bug
>>>> reports about pg_ctl not working.
>>>>
>>> No, that's an indication people aren't using pg_ctl, not that they
>>> aren't using separate config dirs.
>>
>> So, you are saying that people who want config-only directories are just
>> not people who normally use pg_ctl, because if they were, they would
>> have reported the bug?  That seems unlikely.  I will admit the Gentoo
>> case is exactly that.
>
> As Dave has pointed out there are many more people that use it, probably
> most notably Debian/Ubuntu users.
>
>> So we just document that config-only directories don't work for pg_ctl
>> and pg_upgrade?
>>
>
> I'd rather not if it can be avoided.

I think we "can live" with pg_ctl not working - since that problem has
already been solved by these people - at least partially.

Getting pg_upgrade to work would be a *lot* more important.

I'm not sure how big the overlap is - would it be easier if you moved
the required functionality into pg_upgrade itself, as you mentioned at
some point? As in, would it be easier to fix the config-only directory
case for the limited subset of functionality that pg_upgrade needs?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 15:23:47 -0300 2011:
> Peter Eisentraut wrote:
> > On m?n, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
> > > Frankly, I am confused how this breakage has gone unreported for so
> > > long.
> > 
> > Well, nobody is required to use pg_ctl, and for the longest time, it was
> > pg_ctl that was considered to be broken (for various other reasons) and
> > avoided in packaged init scripts.
> 
> Yes, but I am now seeing that pg_ctl is really unfixable.  Is the
> config-only directory really a valuable feature if pg_ctl does not work?
> 
> If we could document that pg_ctl (and pg_upgrade) doesn't work with
> config-only directories, at least we would have a consistent API.  The
> question is whether the config-only directory is useful with this
> restriction.

Evidently people that use config-only dirs don't care all that much
about pg_ctl; we'd have a lot of bugs about it otherwise.  But I don't
think that's the case for pg_upgrade.  I think that simply dictating the
combination of conf-only dirs and pg_upgrade doesn't work is not going
to be a very popular choice, particularly if there's a simple workaround
such as adding a symlink.  (This makes me wonder, though, we don't we
require that said symlink is always in place; maybe have postmaster
create it automatically if it's not present?)

My guess is that we could fix the simple case (the one that doesn't
involve a "-o datadir" option) with the parse-and-report option that has
been mentioned, and dictate that the other one doesn't work.  That's
much less likely to cause a problem in practice.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> >> So, you are saying that people who want config-only directories are just
> >> not people who normally use pg_ctl, because if they were, they would
> >> have reported the bug? ?That seems unlikely. ?I will admit the Gentoo
> >> case is exactly that.
> >
> > As Dave has pointed out there are many more people that use it, probably
> > most notably Debian/Ubuntu users.
> >
> >> So we just document that config-only directories don't work for pg_ctl
> >> and pg_upgrade?
> >>
> >
> > I'd rather not if it can be avoided.
> 
> I think we "can live" with pg_ctl not working - since that problem has
> already been solved by these people - at least partially.
> 
> Getting pg_upgrade to work would be a *lot* more important.

Well, the users are currently symlinking the config files into the real
data directory and running pg_upgrade that way --- we can document that
work-around.

> I'm not sure how big the overlap is - would it be easier if you moved
> the required functionality into pg_upgrade itself, as you mentioned at
> some point? As in, would it be easier to fix the config-only directory
> case for the limited subset of functionality that pg_upgrade needs?

Not really --- it is the -w/wait mode pg_upgrade needs. There is a lot
of new code in pg_ctl that reads the postmaster.pid file for socket
location, port number, etc, that doesn't make sense to duplicate. 
Frankly, there is the huge problem that they might specify the data
directory on the command line --- that would be a bear to support.

I think the only sane fix is to require pg_ctl and pg_upgrade to specify
the config _and_ real data directory.  The fact that PGDATA/-D currently
can point to a config-only directory means this will lead to a host of
confusion.

What would make more sense would be to add a PGCONFIG/-C parameter that
points to the config directory and make PGDATA/-D only point to the real
data directory.  Yes, it is more work for simple config-only installs,
but it allows pg_ctl and pg_upgrade to work with some sanity.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 16:03:47 -0300 2011:

> > I'm not sure how big the overlap is - would it be easier if you moved
> > the required functionality into pg_upgrade itself, as you mentioned at
> > some point? As in, would it be easier to fix the config-only directory
> > case for the limited subset of functionality that pg_upgrade needs?
> 
> Not really --- it is the -w/wait mode pg_upgrade needs. There is a lot
> of new code in pg_ctl that reads the postmaster.pid file for socket
> location, port number, etc, that doesn't make sense to duplicate. 
> Frankly, there is the huge problem that they might specify the data
> directory on the command line --- that would be a bear to support.

How about creating a library with the controlling stuff that's shared by
pg_ctl and pg_upgrade?

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Bruce Momjian's message of lun oct 03 15:23:47 -0300 2011:
> > Peter Eisentraut wrote:
> > > On m?n, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
> > > > Frankly, I am confused how this breakage has gone unreported for so
> > > > long.
> > > 
> > > Well, nobody is required to use pg_ctl, and for the longest time, it was
> > > pg_ctl that was considered to be broken (for various other reasons) and
> > > avoided in packaged init scripts.
> > 
> > Yes, but I am now seeing that pg_ctl is really unfixable.  Is the
> > config-only directory really a valuable feature if pg_ctl does not work?
> > 
> > If we could document that pg_ctl (and pg_upgrade) doesn't work with
> > config-only directories, at least we would have a consistent API.  The
> > question is whether the config-only directory is useful with this
> > restriction.
> 
> Evidently people that use config-only dirs don't care all that much
> about pg_ctl; we'd have a lot of bugs about it otherwise.  But I don't
> think that's the case for pg_upgrade.  I think that simply dictating the
> combination of conf-only dirs and pg_upgrade doesn't work is not going
> to be a very popular choice, particularly if there's a simple workaround
> such as adding a symlink.  (This makes me wonder, though, we don't we
> require that said symlink is always in place; maybe have postmaster
> create it automatically if it's not present?)
> 
> My guess is that we could fix the simple case (the one that doesn't
> involve a "-o datadir" option) with the parse-and-report option that has
> been mentioned, and dictate that the other one doesn't work.  That's
> much less likely to cause a problem in practice.

Well, we are unlikely to backpatch that parse-and-report option so it
would be +2 years before it could be expected to work for even
single-major-version upgrades.  That just seems unworkable.  Yeah. :-(

Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
and pg_upgrade would have to use the real data directory, so I again
wonder what the config-only directory is getting us.

Why were people not using pg_ctl?  Because of the limitations which were
fixed in PG 9.1?  As Dave already said, windows already has to use pg_ctl.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Bruce Momjian's message of lun oct 03 16:03:47 -0300 2011:
> 
> > > I'm not sure how big the overlap is - would it be easier if you moved
> > > the required functionality into pg_upgrade itself, as you mentioned at
> > > some point? As in, would it be easier to fix the config-only directory
> > > case for the limited subset of functionality that pg_upgrade needs?
> > 
> > Not really --- it is the -w/wait mode pg_upgrade needs. There is a lot
> > of new code in pg_ctl that reads the postmaster.pid file for socket
> > location, port number, etc, that doesn't make sense to duplicate. 
> > Frankly, there is the huge problem that they might specify the data
> > directory on the command line --- that would be a bear to support.
> 
> How about creating a library with the controlling stuff that's shared by
> pg_ctl and pg_upgrade?

Fine, but again, unlikely to be backpatched, which means +2 years.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Robert Haas
Date:
On Mon, Oct 3, 2011 at 3:09 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Well, we are unlikely to backpatch that parse-and-report option so it
> would be +2 years before it could be expected to work for even
> single-major-version upgrades.  That just seems unworkable.  Yeah. :-(

I'd like to see the patch first, but I am not convinced that we
couldn't back-patch this.  I am not a big fan of back-patching things
that are not bug fixes, but I think you can make a fairly reasonable
argument that this is a bug in pg_ctl, and therefore in pg_upgrade,
and that we should therefore fix it.  Frankly, I think the
parse-and-report option is the least of our troubles.  Implementing
that much without breaking anything seems like it should be quite
straightforward.  If that's all we need to get ourselves out of this
mess, then let's just go do it (carefully).

The trickier part is that you then have to make sure that - in the
course of fixing the cases where pg_ctl behaves properly today - you
don't make any backward-incompatible behavior changes.  Just for
example, we can't make a unilateral decision now that - in
split-config scenarios - pg_ctl should always be invoked with a -D
argument that points to the postgresql.conf directory rather than the
data directory, because per your email upthread there are cases where
that doesn't work today, and therefore people are probably pointing at
the data directory.  But we probably *could* get away with making
cases work that are currently broken - e.g. allow pg_ctl stop -D $FOO
to work if $FOO is *either* the config dir or the real data dir.  Now,
is that too much to back-patch?  Without having looked at the code,
I'm not sure, but it might turn out it's not that bad.  We've
certainly back-patched scarier stuff before when it's been necessary
to fix bugs - see, for example, commit
ceaf5052c6a7bee794211f5d4c503639bdf3dff0.

Furthermore, if we look at this and ultimately conclude that it's too
invasive to back-patch, all is not lost.  We have a recommended layout
for our tree, and the Ubuntu and Gentoo folks have decided not to use
it (which is perfectly fine), and they have installed various
workarounds for problems like "pg_ctl doesn't work well with that
directory layout".  This will be another scenario that they will need
to work around, and I'm guessing that they are more than capable of
doing that (if they aren't, perhaps they shouldn't have insisted on a
different layout in the first place... but I don't think that's the
case).  We can also document the workarounds for other users who have
this problem, and we can fix it for real in 9.2.  Sure, that will mean
it's 2+ years before people really start being able to take advantage
of the new features, but I don't think that makes it not worth doing.
Rome wasn't built in a day, and this didn't get broken in a day.  I'm
not abandoning all hope of a short-term fix, but even if we do give up
on that, I don't think that a long-term fix plus some documentation of
what to do meanwhile is a crazy approach to the problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 16:09:08 -0300 2011:

> Alvaro Herrera wrote:

> > My guess is that we could fix the simple case (the one that doesn't
> > involve a "-o datadir" option) with the parse-and-report option that has
> > been mentioned, and dictate that the other one doesn't work.  That's
> > much less likely to cause a problem in practice.
> 
> Well, we are unlikely to backpatch that parse-and-report option so it
> would be +2 years before it could be expected to work for even
> single-major-version upgrades.  That just seems unworkable.  Yeah. :-(

If we don't do anything, then it's never going to work.  If we do it
today, we can have it working in the next release (9.2, right?).

"It doesn't work now but will work in the next release; and here's a
workaround that can get you out for now" is a useful if painful answer;
"it's never going to work" is a lot worse.

We've been in that sort of situation before, and the answer has always
been to fix the issue for future users.  Assuming the world doesn't end
next year (a safe bet if you ask me), those are going to be more common
that current users, so it's worth the hassle.

> Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
> and pg_upgrade would have to use the real data directory, so I again
> wonder what the config-only directory is getting us.

Not mixing config stuff (in /etc per FHS) with server data (/var/lib,
again per FHS).  It's Debian policy anyway.  I don't judge whether this
is sane or not.  See
http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

> Why were people not using pg_ctl?  Because of the limitations which were
> fixed in PG 9.1?  As Dave already said, windows already has to use pg_ctl.

As I said, Debian has their own version pg_ctlcluster because of their
work to allow multiple major versions to work simultaneously in the same
server.  I dunno what about Gentoo.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Bruce Momjian's message of lun oct 03 16:09:08 -0300 2011:
> 
> > Alvaro Herrera wrote:
> 
> > > My guess is that we could fix the simple case (the one that doesn't
> > > involve a "-o datadir" option) with the parse-and-report option that has
> > > been mentioned, and dictate that the other one doesn't work.  That's
> > > much less likely to cause a problem in practice.
> > 
> > Well, we are unlikely to backpatch that parse-and-report option so it
> > would be +2 years before it could be expected to work for even
> > single-major-version upgrades.  That just seems unworkable.  Yeah. :-(
> 
> If we don't do anything, then it's never going to work.  If we do it
> today, we can have it working in the next release (9.2, right?).

No, old and new have to support this in both the postgres and pg_ctl
binaries, which is why I said 2+ years, e.g. going from 9.1 to 9.3 is
not going to work, unless we backpatch, and then we have to make sure
users are on later minor versions.

> "It doesn't work now but will work in the next release; and here's a
> workaround that can get you out for now" is a useful if painful answer;
> "it's never going to work" is a lot worse.
> 
> We've been in that sort of situation before, and the answer has always
> been to fix the issue for future users.  Assuming the world doesn't end
> next year (a safe bet if you ask me), those are going to be more common
> that current users, so it's worth the hassle.
> 
> > Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
> > and pg_upgrade would have to use the real data directory, so I again
> > wonder what the config-only directory is getting us.
> 
> Not mixing config stuff (in /etc per FHS) with server data (/var/lib,
> again per FHS).  It's Debian policy anyway.  I don't judge whether this
> is sane or not.  See
> http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

Yes, but why not do this via symlinks?  The problem is pg_ctl has to
read server _state_ which cannot be put in a configuration directory,
and we don't even require the real data directory to be recorded in the
config file.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Magnus Hagander
Date:
On Mon, Oct 3, 2011 at 21:55, Bruce Momjian <bruce@momjian.us> wrote:
> Alvaro Herrera wrote:
>>
>> Excerpts from Bruce Momjian's message of lun oct 03 16:09:08 -0300 2011:
>>
>> > Alvaro Herrera wrote:
>>
>> > > My guess is that we could fix the simple case (the one that doesn't
>> > > involve a "-o datadir" option) with the parse-and-report option that has
>> > > been mentioned, and dictate that the other one doesn't work.  That's
>> > > much less likely to cause a problem in practice.
>> >
>> > Well, we are unlikely to backpatch that parse-and-report option so it
>> > would be +2 years before it could be expected to work for even
>> > single-major-version upgrades.  That just seems unworkable.  Yeah. :-(
>>
>> If we don't do anything, then it's never going to work.  If we do it
>> today, we can have it working in the next release (9.2, right?).
>
> No, old and new have to support this in both the postgres and pg_ctl
> binaries, which is why I said 2+ years, e.g. going from 9.1 to 9.3 is
> not going to work, unless we backpatch, and then we have to make sure
> users are on later minor versions.
>
>> "It doesn't work now but will work in the next release; and here's a
>> workaround that can get you out for now" is a useful if painful answer;
>> "it's never going to work" is a lot worse.
>>
>> We've been in that sort of situation before, and the answer has always
>> been to fix the issue for future users.  Assuming the world doesn't end
>> next year (a safe bet if you ask me), those are going to be more common
>> that current users, so it's worth the hassle.
>>
>> > Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
>> > and pg_upgrade would have to use the real data directory, so I again
>> > wonder what the config-only directory is getting us.
>>
>> Not mixing config stuff (in /etc per FHS) with server data (/var/lib,
>> again per FHS).  It's Debian policy anyway.  I don't judge whether this
>> is sane or not.  See
>> http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
>
> Yes, but why not do this via symlinks?  The problem is pg_ctl has to
> read server _state_ which cannot be put in a configuration directory,
> and we don't even require the real data directory to be recorded in the
> config file.

Well, how does the server get from the config file to where the state
file is? Can we do it the same way, or even expose it to the tools
using a commandline parameter or something?

Or looking just from the pg_upgrade perspective, can we get enough
info out of a running backend before sending signals, or do we need it
on startup as well?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Mon, Oct 3, 2011 at 3:09 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Well, we are unlikely to backpatch that parse-and-report option so it
> > would be +2 years before it could be expected to work for even
> > single-major-version upgrades. ?That just seems unworkable. ?Yeah. :-(
> 
> I'd like to see the patch first, but I am not convinced that we
> couldn't back-patch this.  I am not a big fan of back-patching things
> that are not bug fixes, but I think you can make a fairly reasonable
> argument that this is a bug in pg_ctl, and therefore in pg_upgrade,
> and that we should therefore fix it.  Frankly, I think the
> parse-and-report option is the least of our troubles.  Implementing
> that much without breaking anything seems like it should be quite
> straightforward.  If that's all we need to get ourselves out of this
> mess, then let's just go do it (carefully).

We can't work on a patch until we have the defined behavior we want and
it should be understandable.

> The trickier part is that you then have to make sure that - in the
> course of fixing the cases where pg_ctl behaves properly today - you
> don't make any backward-incompatible behavior changes.  Just for
> example, we can't make a unilateral decision now that - in
> split-config scenarios - pg_ctl should always be invoked with a -D
> argument that points to the postgresql.conf directory rather than the
> data directory, because per your email upthread there are cases where
> that doesn't work today, and therefore people are probably pointing at
> the data directory.  But we probably *could* get away with making
> cases work that are currently broken - e.g. allow pg_ctl stop -D $FOO
> to work if $FOO is *either* the config dir or the real data dir.  Now,
> is that too much to back-patch?  Without having looked at the code,
> I'm not sure, but it might turn out it's not that bad.  We've
> certainly back-patched scarier stuff before when it's been necessary
> to fix bugs - see, for example, commit
> ceaf5052c6a7bee794211f5d4c503639bdf3dff0.

pg_ctl would have to do some detective work to see if PG_VERSION existed
in that directory and adjust its behavior --- the pg_upgrade patch I
posted does this kind of detection.  The goal is the change would happen
only for people using config-only directories, and when a config-only
directory is specified.

> Furthermore, if we look at this and ultimately conclude that it's too
> invasive to back-patch, all is not lost.  We have a recommended layout
> for our tree, and the Ubuntu and Gentoo folks have decided not to use
> it (which is perfectly fine), and they have installed various
> workarounds for problems like "pg_ctl doesn't work well with that
> directory layout".  This will be another scenario that they will need
> to work around, and I'm guessing that they are more than capable of
> doing that (if they aren't, perhaps they shouldn't have insisted on a
> different layout in the first place... but I don't think that's the
> case).  We can also document the workarounds for other users who have

Yes, they are using symlinks now to work around the pg_upgrade/pg_ctl
problem.

> this problem, and we can fix it for real in 9.2.  Sure, that will mean
> it's 2+ years before people really start being able to take advantage
> of the new features, but I don't think that makes it not worth doing.
> Rome wasn't built in a day, and this didn't get broken in a day.  I'm
> not abandoning all hope of a short-term fix, but even if we do give up
> on that, I don't think that a long-term fix plus some documentation of
> what to do meanwhile is a crazy approach to the problem.

I can't figure out what a non-crazy solution looks like.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> Well, how does the server get from the config file to where the state
> file is? Can we do it the same way, or even expose it to the tools
> using a commandline parameter or something?

In that case (the Gentoo example), they use --data-directory
   su -l postgres \       -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \           /usr/lib/postgresql-9.0/bin/pg_ctl \
         start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \           -o '-D ${PGDATA}
--data-directory=${DATA_DIR}\               --silent-mode=true ${PGOPTS}'"
 

We could have pg_ctl read that information from the command line for
pg_ctl start, but for pg_ctl stop, we have no way of getting to that
value.  :-(  It is not like something is missing from the code.  The
user can start multiple clusters from a single config dir and the
information they give gives us no way to know which cluster they want,
or where is it located.  

Yes, this is where the system seems logically broken for our purposes.
It took me a while to understand this problem.

> Or looking just from the pg_upgrade perspective, can we get enough
> info out of a running backend before sending signals, or do we need it
> on startup as well?

pg_upgrade starts with all clusters stopped so there is no way to query
it --- it is a chicken and egg in that we don't know where the data
directory is to start it.


--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 16:55:54 -0300 2011:
> 
> Alvaro Herrera wrote:
> > 
> > Excerpts from Bruce Momjian's message of lun oct 03 16:09:08 -0300 2011:
> > 
> > > Alvaro Herrera wrote:
> > 
> > > > My guess is that we could fix the simple case (the one that doesn't
> > > > involve a "-o datadir" option) with the parse-and-report option that has
> > > > been mentioned, and dictate that the other one doesn't work.  That's
> > > > much less likely to cause a problem in practice.
> > > 
> > > Well, we are unlikely to backpatch that parse-and-report option so it
> > > would be +2 years before it could be expected to work for even
> > > single-major-version upgrades.  That just seems unworkable.  Yeah. :-(
> > 
> > If we don't do anything, then it's never going to work.  If we do it
> > today, we can have it working in the next release (9.2, right?).
> 
> No, old and new have to support this in both the postgres and pg_ctl
> binaries, which is why I said 2+ years, e.g. going from 9.1 to 9.3 is
> not going to work, unless we backpatch, and then we have to make sure
> users are on later minor versions.

Well, so 2 releases.  Same argument.  I hope you're not trying to imply
that the world will end in 2013.  (Note that I don't necessarily
disagree with Robert Haas' opinion that we might be able to backpatch
the postmaster option).


> > > Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
> > > and pg_upgrade would have to use the real data directory, so I again
> > > wonder what the config-only directory is getting us.
> > 
> > Not mixing config stuff (in /etc per FHS) with server data (/var/lib,
> > again per FHS).  It's Debian policy anyway.  I don't judge whether this
> > is sane or not.  See
> > http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
> 
> Yes, but why not do this via symlinks?

It doesn't matter now, because we have the functionality already.

> The problem is pg_ctl has to read server _state_ which cannot be put
> in a configuration directory, and we don't even require the real data
> directory to be recorded in the config file.

How so?  It certainly is in postgresql.conf.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directoriesf

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> > The problem is pg_ctl has to read server _state_ which cannot be put
> > in a configuration directory, and we don't even require the real data
> > directory to be recorded in the config file.
> 
> How so?  It certainly is in postgresql.conf.

See my other email, e.g. -o 'data_directory=/abc'

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 17:06:16 -0300 2011:
> 
> Magnus Hagander wrote:
> > Well, how does the server get from the config file to where the state
> > file is? Can we do it the same way, or even expose it to the tools
> > using a commandline parameter or something?
> 
> In that case (the Gentoo example), they use --data-directory
> 
>     su -l postgres \
>         -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \
>             /usr/lib/postgresql-9.0/bin/pg_ctl \
>             start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \
>             -o '-D ${PGDATA} --data-directory=${DATA_DIR} \
>                 --silent-mode=true ${PGOPTS}'"
> 
> We could have pg_ctl read that information from the command line for
> pg_ctl start, but for pg_ctl stop, we have no way of getting to that
> value.  :-(  It is not like something is missing from the code.  The
> user can start multiple clusters from a single config dir and the
> information they give gives us no way to know which cluster they want,
> or where is it located.  

Well, we have the Gentoo developer in this very thread.  I'm sure they
would fix their command line if we gave them a pg_ctl that worked.
Surely the package that contains the init script also contains pg_ctl,
so they would both be upgraded simultaneously.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Bruce Momjian's message of lun oct 03 17:06:16 -0300 2011:
> > 
> > Magnus Hagander wrote:
> > > Well, how does the server get from the config file to where the state
> > > file is? Can we do it the same way, or even expose it to the tools
> > > using a commandline parameter or something?
> > 
> > In that case (the Gentoo example), they use --data-directory
> > 
> >     su -l postgres \
> >         -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \
> >             /usr/lib/postgresql-9.0/bin/pg_ctl \
> >             start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \
> >             -o '-D ${PGDATA} --data-directory=${DATA_DIR} \
> >                 --silent-mode=true ${PGOPTS}'"
> > 
> > We could have pg_ctl read that information from the command line for
> > pg_ctl start, but for pg_ctl stop, we have no way of getting to that
> > value.  :-(  It is not like something is missing from the code.  The
> > user can start multiple clusters from a single config dir and the
> > information they give gives us no way to know which cluster they want,
> > or where is it located.  
> 
> Well, we have the Gentoo developer in this very thread.  I'm sure they
> would fix their command line if we gave them a pg_ctl that worked.
> Surely the package that contains the init script also contains pg_ctl,
> so they would both be upgraded simultaneously.

What is the fix?  If they started the server by using --data-directory,
pg_ctl stop has no way to find the postmaster.pid file, and hence stop
the server.  Are you suggesting we remove this ability?  We could
require the --data-directory to be specified for pg_ctl stop.  Of
course, just specifying the real data directory for pg_ctl stop works
just fine so what is their motivation to specify both the configuration
and real data directories?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 19:11 +0100, Dave Page wrote:
> On Mon, Oct 3, 2011 at 7:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> > On mån, 2011-10-03 at 11:27 -0400, Bruce Momjian wrote:
> >> Frankly, I am confused how this breakage has gone unreported for so
> >> long.
> >
> > Well, nobody is required to use pg_ctl,
> 
> You are if you wish to run as a service on Windows.

OK, some people are more prone to use pg_ctl than others. ;-)



Re: Bug with pg_ctl -w/wait and config-only directories

From
Robert Haas
Date:
On Mon, Oct 3, 2011 at 3:59 PM, Bruce Momjian <bruce@momjian.us> wrote:
> pg_ctl would have to do some detective work to see if PG_VERSION existed
> in that directory and adjust its behavior --- the pg_upgrade patch I
> posted does this kind of detection.  The goal is the change would happen
> only for people using config-only directories, and when a config-only
> directory is specified.

Exactly.  That sounds like a good improvement for master even if
pg_upgrade were not at issue, and we should do it.  We can also
consider whether it makes sense to back-patch it so that pg_upgrade
can benefit from it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Alvaro Herrera
Date:
Excerpts from Bruce Momjian's message of lun oct 03 17:28:53 -0300 2011:
> 
> Alvaro Herrera wrote:

> > Well, we have the Gentoo developer in this very thread.  I'm sure they
> > would fix their command line if we gave them a pg_ctl that worked.
> > Surely the package that contains the init script also contains pg_ctl,
> > so they would both be upgraded simultaneously.
> 
> What is the fix?  If they started the server by using --data-directory,
> pg_ctl stop has no way to find the postmaster.pid file, and hence stop
> the server.  Are you suggesting we remove this ability?

I am suggesting they don't start it by using --data-directory in the
first place.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> Why were people not using pg_ctl?  Because of the limitations which
> were fixed in PG 9.1?  As Dave already said, windows already has to
> use pg_ctl. 

Historically, pg_ctl has had a lot of limitations.  Just off the top of
my head, nonstandard ports used to break it, nonstandard socket
directories used to break it, nonstandard authentication setups used to
break it, the waiting business was unreliable, the stop modes were weird
and not flexible enough, the behavior in error cases does not conform to
LSB init script conventions, there were some race conditions that I
don't recall the details of right now.  And you had to keep a list of
exactly which of these bugs were addressed in which version.

Basically, pg_ctl is a neat convenience for interactive use for people
who don't want to write advanced shell constructs, but for writing a
robust init script, you can and should do better.  For me personally,
pg_ctl is somewhere between a toy, and annoyance, and a dangerous
instrument.

Obviously, pg_ctl is now a lot better than when it was started, but
that's the reason why it is not used in certain places.




Re: Bug with pg_ctl -w/wait and config-only directories

From
Andrew Dunstan
Date:

On 10/03/2011 04:41 PM, Peter Eisentraut wrote:
> On mån, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
>> Why were people not using pg_ctl?  Because of the limitations which
>> were fixed in PG 9.1?  As Dave already said, windows already has to
>> use pg_ctl.
> Historically, pg_ctl has had a lot of limitations.  Just off the top of
> my head, nonstandard ports used to break it, nonstandard socket
> directories used to break it, nonstandard authentication setups used to
> break it, the waiting business was unreliable, the stop modes were weird
> and not flexible enough, the behavior in error cases does not conform to
> LSB init script conventions, there were some race conditions that I
> don't recall the details of right now.  And you had to keep a list of
> exactly which of these bugs were addressed in which version.


I'm not sure ancient history helps us much here.  Many of these went 
away long ago.


> Basically, pg_ctl is a neat convenience for interactive use for people
> who don't want to write advanced shell constructs, but for writing a
> robust init script, you can and should do better.  For me personally,
> pg_ctl is somewhere between a toy, and annoyance, and a dangerous
> instrument.
>
> Obviously, pg_ctl is now a lot better than when it was started, but
> that's the reason why it is not used in certain places.
>
>

Our job should be to make it better.

cheers

andrew


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 10/03/2011 04:41 PM, Peter Eisentraut wrote:
> > On m?n, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> >> Why were people not using pg_ctl?  Because of the limitations which
> >> were fixed in PG 9.1?  As Dave already said, windows already has to
> >> use pg_ctl.
> > Historically, pg_ctl has had a lot of limitations.  Just off the top of
> > my head, nonstandard ports used to break it, nonstandard socket
> > directories used to break it, nonstandard authentication setups used to
> > break it, the waiting business was unreliable, the stop modes were weird
> > and not flexible enough, the behavior in error cases does not conform to
> > LSB init script conventions, there were some race conditions that I
> > don't recall the details of right now.  And you had to keep a list of
> > exactly which of these bugs were addressed in which version.
> 
> 
> I'm not sure ancient history helps us much here.  Many of these went 
> away long ago.

Agreed.  You could argue that pg_ctl 9.1 is much better than anything
anyone would be able to craft in a script.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> 
> Excerpts from Bruce Momjian's message of lun oct 03 17:28:53 -0300 2011:
> > 
> > Alvaro Herrera wrote:
> 
> > > Well, we have the Gentoo developer in this very thread.  I'm sure they
> > > would fix their command line if we gave them a pg_ctl that worked.
> > > Surely the package that contains the init script also contains pg_ctl,
> > > so they would both be upgraded simultaneously.
> > 
> > What is the fix?  If they started the server by using --data-directory,
> > pg_ctl stop has no way to find the postmaster.pid file, and hence stop
> > the server.  Are you suggesting we remove this ability?
> 
> I am suggesting they don't start it by using --data-directory in the
> first place.

Agreed.  If you remove that, the logical problem goes away and it
becomes a simple problem of dumping the contents of postgresql.conf and
having pg_ctl (and pg_upgrade) use that.  Let me look at how much code
that would take.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Andrew Dunstan
Date:

On 10/03/2011 06:45 PM, Bruce Momjian wrote:
> Alvaro Herrera wrote:
>> Excerpts from Bruce Momjian's message of lun oct 03 17:28:53 -0300 2011:
>>> Alvaro Herrera wrote:
>>>> Well, we have the Gentoo developer in this very thread.  I'm sure they
>>>> would fix their command line if we gave them a pg_ctl that worked.
>>>> Surely the package that contains the init script also contains pg_ctl,
>>>> so they would both be upgraded simultaneously.
>>> What is the fix?  If they started the server by using --data-directory,
>>> pg_ctl stop has no way to find the postmaster.pid file, and hence stop
>>> the server.  Are you suggesting we remove this ability?
>> I am suggesting they don't start it by using --data-directory in the
>> first place.
> Agreed.  If you remove that, the logical problem goes away and it
> becomes a simple problem of dumping the contents of postgresql.conf and
> having pg_ctl (and pg_upgrade) use that.  Let me look at how much code
> that would take.
>

Yeah, this pattern can be changed to have a config file that reads:
   data_directory = '/path/to/data'   include '/path/to/common/config'

and I presume (or hope) that would meet your need, and not upset the FHS 
purists.


cheers

andrew


Re: Bug with pg_ctl -w/wait and config-only directories

From
Aidan Van Dyk
Date:
On Mon, Oct 3, 2011 at 7:10 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

>> Agreed.  If you remove that, the logical problem goes away and it
>> becomes a simple problem of dumping the contents of postgresql.conf and
>> having pg_ctl (and pg_upgrade) use that.  Let me look at how much code
>> that would take.
>>
>
> Yeah, this pattern can be changed to have a config file that reads:
>
>   data_directory = '/path/to/data'
>   include '/path/to/common/config'
>
> and I presume (or hope) that would meet your need, and not upset the FHS
> purists.

I kinda like the way the debian (and ubuntu) packages do it...

They start pg_ctl/postgres like:   ... -D /path/to/real-data/data-dir -c
config_file=/etc/postgresql/$INSTANCE/postgresql.conf

In /etc/postgresql/$INSTANCE/postgresql.conf, these are explictly set: data_directory=/path/to/real-data/data-dir
hba_file=/etc/postgresql/$INSTANCE/pg_hba.confident_file=/etc/postgresql/$INSTANCE/pg_ident.conf
external_pid_file=/var/run/postgresql/$INSTANCE.pid

It actually looks in /etc/postgresql/$INSTANCE/postgresql.conf to find
data_directory to use when invoking pg_ctl/postgres.

But, in my opinion, there is enough flexibility with postgresql's
config (and ability to pass un"recorded" options to postmaster at
startup too) that pg_upgrade can't guarantee it's going to figure out
every thing "automatically given a single $pgdata location to start
from".  That's simply not realistic.  Distros who do stranger things
than debian (and probably even Debian) are going to have to give their
users guidance on how to call pg_upgrade with their specific setup of
paths/configs/invocations.  It's simply that simple.

I'ld be happy enough if pg_upgrade could easily upgrade given a
datadir that had a postgresql.conf in it, or possibly a
postgresql.conf that had data_directory set in it.

Anything else, and I say it's responsibility of whoever scripted the
startup to be able to provide all the necessary information to
pg_upgrade (be it by extra command line options, or crafting a special
pg_data with symlinks that is more "normal").

a.
--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> On 10/03/2011 06:45 PM, Bruce Momjian wrote:
> > Alvaro Herrera wrote:
> >> Excerpts from Bruce Momjian's message of lun oct 03 17:28:53 -0300 2011:
> >>> Alvaro Herrera wrote:
> >>>> Well, we have the Gentoo developer in this very thread.  I'm sure they
> >>>> would fix their command line if we gave them a pg_ctl that worked.
> >>>> Surely the package that contains the init script also contains pg_ctl,
> >>>> so they would both be upgraded simultaneously.
> >>> What is the fix?  If they started the server by using --data-directory,
> >>> pg_ctl stop has no way to find the postmaster.pid file, and hence stop
> >>> the server.  Are you suggesting we remove this ability?
> >> I am suggesting they don't start it by using --data-directory in the
> >> first place.
> > Agreed.  If you remove that, the logical problem goes away and it
> > becomes a simple problem of dumping the contents of postgresql.conf and
> > having pg_ctl (and pg_upgrade) use that.  Let me look at how much code
> > that would take.
> >
> 
> Yeah, this pattern can be changed to have a config file that reads:
> 
>     data_directory = '/path/to/data'
>     include '/path/to/common/config'
> 
> and I presume (or hope) that would meet your need, and not upset the FHS 
> purists.

Actually, the existing setup is fine as long as there is something that
tell us where to find the data directory.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:
> >
> > Excerpts from Bruce Momjian's message of lun oct 03 17:28:53 -0300 2011:
> > >
> > > Alvaro Herrera wrote:
> >
> > > > Well, we have the Gentoo developer in this very thread.  I'm sure they
> > > > would fix their command line if we gave them a pg_ctl that worked.
> > > > Surely the package that contains the init script also contains pg_ctl,
> > > > so they would both be upgraded simultaneously.
> > >
> > > What is the fix?  If they started the server by using --data-directory,
> > > pg_ctl stop has no way to find the postmaster.pid file, and hence stop
> > > the server.  Are you suggesting we remove this ability?
> >
> > I am suggesting they don't start it by using --data-directory in the
> > first place.
>
> Agreed.  If you remove that, the logical problem goes away and it
> becomes a simple problem of dumping the contents of postgresql.conf and
> having pg_ctl (and pg_upgrade) use that.  Let me look at how much code
> that would take.

OK, here is a patch that adds a -C option to the postmaster so any
config variable can be dumped, even while the server is running (there
is no security check because we don't have a user name at this point),
e.g.:

    postgres -D /pg_upgrade/tmp -C data_directory
    /u/pg/data

It also modifies pg_ctl to use this feature.  It works fine for pg_ctl
-w start/stop with a config-only directory, so this is certainly in the
right direction.  You can also use pg_ctl -o '--data_directory=/abc' and
it will be understood:

    pg_ctl -o '--data_directory=/u/pg/data' -D tmp start

If you used --data_directory to start the server, you will need to use
--data_directory to stop it, which seems reasonable.

Patch attached.  This was much simpler than I thought.  :-)

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
new file mode 100644
index 0a84d97..660458e
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** bool        enable_bonjour = false;
*** 203,208 ****
--- 203,210 ----
  char       *bonjour_name;
  bool        restart_after_crash = true;

+ char         dump_config_variable[MAXPGPATH] = "";
+
  /* PIDs of special child processes; 0 when not running */
  static pid_t StartupPID = 0,
              BgWriterPID = 0,
*************** PostmasterMain(int argc, char *argv[])
*** 537,543 ****
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
--- 539,545 ----
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
*************** PostmasterMain(int argc, char *argv[])
*** 554,559 ****
--- 556,565 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 strlcpy(dump_config_variable, optarg, MAXPGPATH);
+                 break;
+
              case 'D':
                  userDoption = optarg;
                  break;
*************** PostmasterMain(int argc, char *argv[])
*** 728,733 ****
--- 734,746 ----
      if (!SelectConfigFiles(userDoption, progname))
          ExitPostmaster(2);

+     if (dump_config_variable[0] != '\0')
+     {
+         /* This allows anyone to read super-user config values. */
+         printf("%s\n", GetConfigOption(dump_config_variable, false, false));
+         ExitPostmaster(0);
+     }
+
      /* Verify that DataDir looks reasonable */
      checkDataDir();

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
new file mode 100644
index c7eac71..a5eae49
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** process_postgres_switches(int argc, char
*** 3170,3176 ****
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
--- 3170,3176 ----
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
*************** process_postgres_switches(int argc, char
*** 3187,3192 ****
--- 3187,3196 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 /* ignored for consistency with postmaster */
+                 break;
+
              case 'D':
                  if (secure)
                      userDoption = strdup(optarg);
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbdfe7..18a02ad
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*************** static ShutdownMode shutdown_mode = SMAR
*** 81,86 ****
--- 81,87 ----
  static int    sig = SIGTERM;        /* default */
  static CtlCommand ctl_command = NO_COMMAND;
  static char *pg_data = NULL;
+ static char *pg_config = NULL;
  static char *pgdata_opt = NULL;
  static char *post_opts = NULL;
  static const char *progname;
*************** static void do_status(void);
*** 131,136 ****
--- 132,138 ----
  static void do_promote(void);
  static void do_kill(pgpid_t pid);
  static void print_msg(const char *msg);
+ static void adjust_data_dir(void);

  #if defined(WIN32) || defined(__CYGWIN__)
  static bool pgwin32_IsInstalled(SC_HANDLE);
*************** pgwin32_CommandLine(bool registration)
*** 1265,1274 ****
          strcat(cmdLine, "\"");
      }

!     if (pg_data)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_data);
          strcat(cmdLine, "\"");
      }

--- 1267,1276 ----
          strcat(cmdLine, "\"");
      }

!     if (pg_config)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_config);
          strcat(cmdLine, "\"");
      }

*************** set_starttype(char *starttypeopt)
*** 1886,1891 ****
--- 1888,1942 ----
  }
  #endif

+ /*
+  * adjust_data_dir
+  *
+  * If a configuration-only directory was specified, find the real data dir.
+  */
+ void
+ adjust_data_dir(void)
+ {
+     char        cmd[MAXPGPATH], filename[MAXPGPATH];
+     FILE       *fd;
+
+     /* If there is no postgresql.conf, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/postgresql.conf", pg_config);
+     if ((fd = fopen(filename, "r")) == NULL)
+         return;
+     fclose(fd);
+
+     /* If PG_VERSION exists, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/PG_VERSION", pg_config);
+     if ((fd = fopen(filename, "r")) != NULL)
+     {
+         fclose(fd);
+         return;
+     }
+
+     /* Must be a configuration directory, so find the data directory */
+
+     /* problem that we set this here and might cause it later not to change */
+     if (exec_path == NULL)
+         exec_path = find_other_exec_or_die(argv0, "postgres", PG_BACKEND_VERSIONSTR);
+
+     snprintf(cmd, MAXPGPATH, SYSTEMQUOTE "\"%s\" %s%s -C data_directory" SYSTEMQUOTE,
+              exec_path, pgdata_opt ? pgdata_opt : "", post_opts ?
+              post_opts : "");
+
+     fd = popen(cmd, "r");
+     if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL)
+     {
+         write_stderr(_("%s: cannot find the data directory using %s\n"), progname, exec_path);
+         exit(1);
+     }
+     pclose(fd);
+
+     if (strlen(filename) > 0 && filename[strlen(filename) - 1] == '\n')
+         filename[strlen(filename) - 1] = '\0';
+     free(pg_data);
+     pg_data = xstrdup(filename);
+ }
+

  int
  main(int argc, char **argv)
*************** main(int argc, char **argv)
*** 2120,2133 ****
      }

      /* Note we put any -D switch into the env var above */
!     pg_data = getenv("PGDATA");
!     if (pg_data)
      {
!         pg_data = xstrdup(pg_data);
!         canonicalize_path(pg_data);
      }

!     if (pg_data == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),
--- 2171,2187 ----
      }

      /* Note we put any -D switch into the env var above */
!     pg_config = getenv("PGDATA");
!     if (pg_config)
      {
!         pg_config = xstrdup(pg_config);
!         canonicalize_path(pg_config);
!         pg_data = xstrdup(pg_config);
      }

!     adjust_data_dir();
!
!     if (pg_config == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),

Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 17:12 -0400, Andrew Dunstan wrote:
> 
> On 10/03/2011 04:41 PM, Peter Eisentraut wrote:
> > On mån, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> >> Why were people not using pg_ctl?  Because of the limitations which
> >> were fixed in PG 9.1?  As Dave already said, windows already has to
> >> use pg_ctl.
> > Historically, pg_ctl has had a lot of limitations.  Just off the top of
> > my head, nonstandard ports used to break it, nonstandard socket
> > directories used to break it, nonstandard authentication setups used to
> > break it, the waiting business was unreliable, the stop modes were weird
> > and not flexible enough, the behavior in error cases does not conform to
> > LSB init script conventions, there were some race conditions that I
> > don't recall the details of right now.  And you had to keep a list of
> > exactly which of these bugs were addressed in which version.

> I'm not sure ancient history helps us much here.  Many of these went 
> away long ago.

But some of them are still there.  8.4 is still the packaged version in
some popular Linux distributions, and the fabled fixed-it-all version
9.1 was just released a few weeks ago.  So in current production
environments, pg_ctl is still an occasional liability.

> Our job should be to make it better.

Yeah, don't get me wrong, let's make it better.




Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 18:44 -0400, Bruce Momjian wrote:
> Agreed.  You could argue that pg_ctl 9.1 is much better than anything
> anyone would be able to craft in a script.

And what facts support that argument?

Anyway, this comes back to your favorite argument upthread.  pg_ctl has
had occasional problems in the past.  So people have created
alternatives that are customized for their particular use case.  And
some of those init scripts and the like support versions back from 8.1
or 8.2 up to last week.  So it will take 2, 3, or 5 years until they'd
even consider abandoning their frameworks for, say, a pg_ctl-based
solution.  And even then there will individual debates on whether it
would be worth doing.

So, the bottom line is, until now there has been no widespread "forced"
use of pg_ctl, outside of Windows and pg_upgrade.  PostgreSQL 9.0 was
not packaged in any released version of Debian or Ubuntu, so I guess not
a lot of people tried pg_upgrade in those configurations.  Then again,
9.1 isn't packaged in a released OS version either, of course, so I have
no idea why this is coming up exactly now.



Re: Bug with pg_ctl -w/wait and config-only directories

From
Robert Haas
Date:
On Mon, Oct 3, 2011 at 11:04 PM, Bruce Momjian <bruce@momjian.us> wrote:
> OK, here is a patch that adds a -C option to the postmaster so any
> config variable can be dumped, even while the server is running (there
> is no security check because we don't have a user name at this point),
> e.g.:
>
>        postgres -D /pg_upgrade/tmp -C data_directory
>        /u/pg/data

It seems both ugly and unnecessary to declare dump_config_variable as
char[MAXPGPATH].  MAXPGPATH doesn't seem like the right length for a
buffer intended to hold a GUC name, but in fact I don't think you need
a buffer at all.  I think you can just declare it as char * and say
dump_config_variable = optarg. getopt() doesn't overwrite the input
argument vector, does it?

Also, I think you should remove this comment:

+         /* This allows anyone to read super-user config values. */

It allows anyone to read super-user config values *who can already
read postgresql.conf*.  Duh.

> It also modifies pg_ctl to use this feature.  It works fine for pg_ctl
> -w start/stop with a config-only directory, so this is certainly in the
> right direction.  You can also use pg_ctl -o '--data_directory=/abc' and
> it will be understood:
>
>        pg_ctl -o '--data_directory=/u/pg/data' -D tmp start
>
> If you used --data_directory to start the server, you will need to use
> --data_directory to stop it, which seems reasonable.

Yep.  I think that when adjust_data_dir() changes pg_data it probably
needs to call canonicalize_path() on the new value.

> Patch attached.  This was much simpler than I thought.  :-)

Yes, this looks pretty simple.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> On m?n, 2011-10-03 at 18:44 -0400, Bruce Momjian wrote:
> > Agreed.  You could argue that pg_ctl 9.1 is much better than anything
> > anyone would be able to craft in a script.
> 
> And what facts support that argument?

Because pg_ctl 9.1 will read postmaster.pid and find the port number,
socket location, and listen host for wait mode --- I doubt someone would
do that work in a script.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Greg Stark
Date:
On Tue, Oct 4, 2011 at 2:42 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Because pg_ctl 9.1 will read postmaster.pid and find the port number,
> socket location, and listen host for wait mode --- I doubt someone would
> do that work in a script.

But this is the whole difference between them. An init.d script
*shouldn't* do all that. It *knows* how the system daemon is
configured and should only be used to start and stop that process. And
it can't wait, it's not an interactive tool, it has to implement the
standard init.d interface.

An interactive tool can dwim automatically but that isn't appropriate
for a startup script. A startupt script should always do the same
thing exactly and do that based on the OS policy, not based on
inspecting what programs are actually running on the machine.

-- 
greg


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Mon, Oct 3, 2011 at 11:04 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > OK, here is a patch that adds a -C option to the postmaster so any
> > config variable can be dumped, even while the server is running (there
> > is no security check because we don't have a user name at this point),
> > e.g.:
> >
> > ? ? ? ?postgres -D /pg_upgrade/tmp -C data_directory
> > ? ? ? ?/u/pg/data
>
> It seems both ugly and unnecessary to declare dump_config_variable as
> char[MAXPGPATH].  MAXPGPATH doesn't seem like the right length for a
> buffer intended to hold a GUC name, but in fact I don't think you need
> a buffer at all.  I think you can just declare it as char * and say
> dump_config_variable = optarg. getopt() doesn't overwrite the input
> argument vector, does it?

Well, as I remember, it writes a null byte at the end of the argument
and then passes the pointer to the start --- when it advances to the
next argument, it removes the null, so the pointer might still be valid,
but does not have proper termination (or maybe that's what strtok()
does).  However, I can find no documentation that mentions this
restriction, so perhaps it is old and no longer valid.

If you look in our code you will see tons of cases of:

    user = strdup(optarg);
    pg_data = xstrdup(optarg);
    my_opts->dbname = mystrdup(optarg);

However, I see other cases where we just assign optarg and optarg is a
string, e.g. pg_dump:

    username = optarg;

Doing a Google search shows both types of coding in random code pieces.

Are the optarg string duplication calls unnecessary and can be removed?
Either that, or we need to add some.

> Also, I think you should remove this comment:
>
> +         /* This allows anyone to read super-user config values. */
>
> It allows anyone to read super-user config values *who can already
> read postgresql.conf*.  Duh.

Oh, very good point --- I had not realized that.  Comment updated.

> > It also modifies pg_ctl to use this feature. ?It works fine for pg_ctl
> > -w start/stop with a config-only directory, so this is certainly in the
> > right direction. ?You can also use pg_ctl -o '--data_directory=/abc' and
> > it will be understood:
> >
> > ? ? ? ?pg_ctl -o '--data_directory=/u/pg/data' -D tmp start
> >
> > If you used --data_directory to start the server, you will need to use
> > --data_directory to stop it, which seems reasonable.
>
> Yep.  I think that when adjust_data_dir() changes pg_data it probably
> needs to call canonicalize_path() on the new value.

Done.

> > Patch attached. ?This was much simpler than I thought. ?:-)
>
> Yes, this looks pretty simple.

What really saved my bacon was the fact that the -o parameters are
passed into the backend and processed by the backend, rather than pg_ctl
having to read through that mess and parse it.  Doing that logic was
going to be very hard and unlikely to be back-patch-able.

Updated patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
new file mode 100644
index 0a84d97..122c206
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** bool        enable_bonjour = false;
*** 203,208 ****
--- 203,210 ----
  char       *bonjour_name;
  bool        restart_after_crash = true;

+ char         *output_config_variable = NULL;
+
  /* PIDs of special child processes; 0 when not running */
  static pid_t StartupPID = 0,
              BgWriterPID = 0,
*************** PostmasterMain(int argc, char *argv[])
*** 537,543 ****
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
--- 539,545 ----
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
*************** PostmasterMain(int argc, char *argv[])
*** 554,559 ****
--- 556,565 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 output_config_variable = strdup(optarg);
+                 break;
+
              case 'D':
                  userDoption = optarg;
                  break;
*************** PostmasterMain(int argc, char *argv[])
*** 728,733 ****
--- 734,746 ----
      if (!SelectConfigFiles(userDoption, progname))
          ExitPostmaster(2);

+     if (output_config_variable != NULL)
+     {
+         /* permission is handled because the user is reading inside the data dir */
+         puts(GetConfigOption(output_config_variable, false, false));
+         ExitPostmaster(0);
+     }
+
      /* Verify that DataDir looks reasonable */
      checkDataDir();

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
new file mode 100644
index c7eac71..a5eae49
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** process_postgres_switches(int argc, char
*** 3170,3176 ****
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
--- 3170,3176 ----
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
*************** process_postgres_switches(int argc, char
*** 3187,3192 ****
--- 3187,3196 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 /* ignored for consistency with postmaster */
+                 break;
+
              case 'D':
                  if (secure)
                      userDoption = strdup(optarg);
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbdfe7..e633d0c
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*************** static ShutdownMode shutdown_mode = SMAR
*** 81,86 ****
--- 81,87 ----
  static int    sig = SIGTERM;        /* default */
  static CtlCommand ctl_command = NO_COMMAND;
  static char *pg_data = NULL;
+ static char *pg_config = NULL;
  static char *pgdata_opt = NULL;
  static char *post_opts = NULL;
  static const char *progname;
*************** static void do_status(void);
*** 131,136 ****
--- 132,138 ----
  static void do_promote(void);
  static void do_kill(pgpid_t pid);
  static void print_msg(const char *msg);
+ static void adjust_data_dir(void);

  #if defined(WIN32) || defined(__CYGWIN__)
  static bool pgwin32_IsInstalled(SC_HANDLE);
*************** pgwin32_CommandLine(bool registration)
*** 1265,1274 ****
          strcat(cmdLine, "\"");
      }

!     if (pg_data)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_data);
          strcat(cmdLine, "\"");
      }

--- 1267,1276 ----
          strcat(cmdLine, "\"");
      }

!     if (pg_config)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_config);
          strcat(cmdLine, "\"");
      }

*************** set_starttype(char *starttypeopt)
*** 1886,1891 ****
--- 1888,1946 ----
  }
  #endif

+ /*
+  * adjust_data_dir
+  *
+  * If a configuration-only directory was specified, find the real data dir.
+  */
+ void
+ adjust_data_dir(void)
+ {
+     char        cmd[MAXPGPATH], filename[MAXPGPATH], *my_exec_path;
+     FILE       *fd;
+
+     /* If there is no postgresql.conf, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/postgresql.conf", pg_config);
+     if ((fd = fopen(filename, "r")) == NULL)
+         return;
+     fclose(fd);
+
+     /* If PG_VERSION exists, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/PG_VERSION", pg_config);
+     if ((fd = fopen(filename, "r")) != NULL)
+     {
+         fclose(fd);
+         return;
+     }
+
+     /* Must be a configuration directory, so find the data directory */
+
+     /* we use a private my_exec_path to avoid interfering with later uses */
+     if (exec_path == NULL)
+         my_exec_path = find_other_exec_or_die(argv0, "postgres", PG_BACKEND_VERSIONSTR);
+     else
+         my_exec_path = xstrdup(exec_path);
+
+     snprintf(cmd, MAXPGPATH, SYSTEMQUOTE "\"%s\" %s%s -C data_directory" SYSTEMQUOTE,
+              my_exec_path, pgdata_opt ? pgdata_opt : "", post_opts ?
+              post_opts : "");
+
+     fd = popen(cmd, "r");
+     if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL)
+     {
+         write_stderr(_("%s: cannot find the data directory using %s\n"), progname, my_exec_path);
+         exit(1);
+     }
+     pclose(fd);
+     free(my_exec_path);
+
+     if (strlen(filename) > 0 && filename[strlen(filename) - 1] == '\n')
+         filename[strlen(filename) - 1] = '\0';
+     free(pg_data);
+     pg_data = xstrdup(filename);
+     canonicalize_path(pg_data);
+ }
+

  int
  main(int argc, char **argv)
*************** main(int argc, char **argv)
*** 2120,2133 ****
      }

      /* Note we put any -D switch into the env var above */
!     pg_data = getenv("PGDATA");
!     if (pg_data)
      {
!         pg_data = xstrdup(pg_data);
!         canonicalize_path(pg_data);
      }

!     if (pg_data == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),
--- 2175,2191 ----
      }

      /* Note we put any -D switch into the env var above */
!     pg_config = getenv("PGDATA");
!     if (pg_config)
      {
!         pg_config = xstrdup(pg_config);
!         canonicalize_path(pg_config);
!         pg_data = xstrdup(pg_config);
      }

!     adjust_data_dir();
!
!     if (pg_config == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),

Re: Bug with pg_ctl -w/wait and config-only directories

From
Robert Haas
Date:
On Tue, Oct 4, 2011 at 10:55 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> It seems both ugly and unnecessary to declare dump_config_variable as
>> char[MAXPGPATH].  MAXPGPATH doesn't seem like the right length for a
>> buffer intended to hold a GUC name, but in fact I don't think you need
>> a buffer at all.  I think you can just declare it as char * and say
>> dump_config_variable = optarg. getopt() doesn't overwrite the input
>> argument vector, does it?
>
> Well, as I remember, it writes a null byte at the end of the argument
> and then passes the pointer to the start --- when it advances to the
> next argument, it removes the null, so the pointer might still be valid,
> but does not have proper termination (or maybe that's what strtok()
> does).  However, I can find no documentation that mentions this
> restriction, so perhaps it is old and no longer valid.
>
> If you look in our code you will see tons of cases of:
>
>        user = strdup(optarg);
>        pg_data = xstrdup(optarg);
>        my_opts->dbname = mystrdup(optarg);
>
> However, I see other cases where we just assign optarg and optarg is a
> string, e.g. pg_dump:
>
>        username = optarg;
>
> Doing a Google search shows both types of coding in random code pieces.
>
> Are the optarg string duplication calls unnecessary and can be removed?
> Either that, or we need to add some.

Well, if you want to keep it, I'd suggest using malloc() to get an
appropriate size buffer (not palloc) rather than using some arbitrary
constant for the length.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Tue, Oct 4, 2011 at 10:55 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >> It seems both ugly and unnecessary to declare dump_config_variable as
> >> char[MAXPGPATH]. ?MAXPGPATH doesn't seem like the right length for a
> >> buffer intended to hold a GUC name, but in fact I don't think you need
> >> a buffer at all. ?I think you can just declare it as char * and say
> >> dump_config_variable = optarg. getopt() doesn't overwrite the input
> >> argument vector, does it?
> >
> > Well, as I remember, it writes a null byte at the end of the argument
> > and then passes the pointer to the start --- when it advances to the
> > next argument, it removes the null, so the pointer might still be valid,
> > but does not have proper termination (or maybe that's what strtok()
> > does). ?However, I can find no documentation that mentions this
> > restriction, so perhaps it is old and no longer valid.
> >
> > If you look in our code you will see tons of cases of:
> >
> > ? ? ? ?user = strdup(optarg);
> > ? ? ? ?pg_data = xstrdup(optarg);
> > ? ? ? ?my_opts->dbname = mystrdup(optarg);
> >
> > However, I see other cases where we just assign optarg and optarg is a
> > string, e.g. pg_dump:
> >
> > ? ? ? ?username = optarg;
> >
> > Doing a Google search shows both types of coding in random code pieces.
> >
> > Are the optarg string duplication calls unnecessary and can be removed?
> > Either that, or we need to add some.
> 
> Well, if you want to keep it, I'd suggest using malloc() to get an
> appropriate size buffer (not palloc) rather than using some arbitrary
> constant for the length.

The new code does strdup(), which will match what is passed.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
jreidthompson
Date:
.
Alvaro Herrera-7 wrote:
> 
>  I dunno what about Gentoo.
> -
some info about gentoo

http://pastebin.com/9hbLmVJA

http://www.gentoo.org/doc/en/postgres-howto.xml

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Bug-with-pg-ctl-w-wait-and-config-only-directories-tp4860202p4868931.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Tue, Oct 04, 2011 at 09:42:42AM -0400, Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > On m?n, 2011-10-03 at 18:44 -0400, Bruce Momjian wrote:
> > > Agreed.  You could argue that pg_ctl 9.1 is much better than anything
> > > anyone would be able to craft in a script.
> >
> > And what facts support that argument?
>
> Because pg_ctl 9.1 will read postmaster.pid and find the port number,
> socket location, and listen host for wait mode --- I doubt someone would
> do that work in a script.

I don't know why we chose pg_ctl, but I chose to stick with pg_ctl over
rolling my own precisely because of that. pg_ctl does all sorts of fancy
things that would have taken me a much longer time than figuring out a way
to allow a configuration-only directory and a data-only directory.

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Mon, Oct 03, 2011 at 04:49:21PM -0300, Alvaro Herrera wrote:
>
> Excerpts from Bruce Momjian's message of lun oct 03 16:09:08 -0300 2011:
>
> > Yes, auto-creation of symlinks would be useful, but at that point pg_ctl
> > and pg_upgrade would have to use the real data directory, so I again
> > wonder what the config-only directory is getting us.
>
> Not mixing config stuff (in /etc per FHS) with server data (/var/lib,
> again per FHS).  It's Debian policy anyway.  I don't judge whether this
> is sane or not.  See
> http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
>
> > Why were people not using pg_ctl?  Because of the limitations which were
> > fixed in PG 9.1?  As Dave already said, windows already has to use pg_ctl.
>
> As I said, Debian has their own version pg_ctlcluster because of their
> work to allow multiple major versions to work simultaneously in the same
> server.  I dunno what about Gentoo.

I implemented separated configuration and data directories to adhere to
FHS. Actually, I had a bug on bugs.gentoo.org -- again, actually, there
have been a few bugs over the years -- requesting that I make PostgreSQL
adhere to the FHS.

I made it work using pg_ctl. The more curious among you can take a look at
the initscripts and related config files from my devspace. [1]

'/etc/init.d/postgresql-9.0 restart' will actually call stop() and
start(). Otherwise, I've mostly paralleled initscript functionality with
the functionality of pg_ctl. Multiple initscripts are installed
side-by-side to control multiple major versions.

1. http://dev.gentoo.org/~titanofold/

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Greg Stark wrote:
> On Tue, Oct 4, 2011 at 2:42 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Because pg_ctl 9.1 will read postmaster.pid and find the port number,
> > socket location, and listen host for wait mode --- I doubt someone would
> > do that work in a script.
> 
> But this is the whole difference between them. An init.d script
> *shouldn't* do all that. It *knows* how the system daemon is
> configured and should only be used to start and stop that process. And
> it can't wait, it's not an interactive tool, it has to implement the
> standard init.d interface.
> 
> An interactive tool can dwim automatically but that isn't appropriate
> for a startup script. A startupt script should always do the same
> thing exactly and do that based on the OS policy, not based on
> inspecting what programs are actually running on the machine.

I agree, except the Gentoo script does exactly that --- wait for
completion using pg_ctl -w.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Greg Stark wrote:
>> An interactive tool can dwim automatically but that isn't appropriate
>> for a startup script. A startupt script should always do the same
>> thing exactly and do that based on the OS policy, not based on
>> inspecting what programs are actually running on the machine.

> I agree, except the Gentoo script does exactly that --- wait for
> completion using pg_ctl -w.

As of fairly recently, the Fedora package also uses pg_ctl for both
starting and stopping.  We've fixed all the reasons that formerly
existed to avoid use of pg_ctl, and it's a real PITA to try to
implement the waiting logic at shell level.
        regards, tom lane


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Greg Stark wrote:
> >> An interactive tool can dwim automatically but that isn't appropriate
> >> for a startup script. A startupt script should always do the same
> >> thing exactly and do that based on the OS policy, not based on
> >> inspecting what programs are actually running on the machine.
> 
> > I agree, except the Gentoo script does exactly that --- wait for
> > completion using pg_ctl -w.
> 
> As of fairly recently, the Fedora package also uses pg_ctl for both
> starting and stopping.  We've fixed all the reasons that formerly
> existed to avoid use of pg_ctl, and it's a real PITA to try to
> implement the waiting logic at shell level.

OK, that's a good use-case to warrant barrelling ahead with improving
pg_ctl for config-only installs.  What releases do we want to apply that
patch to allow postgres to dump config values and have pg_ctl use them? 
This patch is required for old/new installs for pg_upgrade to work.

Once we decide that, I will work on the pg_upgrade code to use this as
well.  pg_upgrade will use the new pg_ctl but it also needs to find the
data directory via the postgres binary.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> As of fairly recently, the Fedora package also uses pg_ctl for both
>> starting and stopping.  We've fixed all the reasons that formerly
>> existed to avoid use of pg_ctl, and it's a real PITA to try to
>> implement the waiting logic at shell level.

> OK, that's a good use-case to warrant barrelling ahead with improving
> pg_ctl for config-only installs.

Did I say that?  The Fedora packages don't support config-only
arrangements (if you tried, SELinux would likely deny access to whatever
directory you chose ...)

> What releases do we want to apply that
> patch to allow postgres to dump config values and have pg_ctl use them? 
> This patch is required for old/new installs for pg_upgrade to work.

I still think this is a matter for HEAD only.  We haven't supported
these cases in back branches and so there is little argument for
back-patching.
        regards, tom lane


Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On tis, 2011-10-04 at 17:49 -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Greg Stark wrote:
> >> An interactive tool can dwim automatically but that isn't appropriate
> >> for a startup script. A startupt script should always do the same
> >> thing exactly and do that based on the OS policy, not based on
> >> inspecting what programs are actually running on the machine.
> 
> > I agree, except the Gentoo script does exactly that --- wait for
> > completion using pg_ctl -w.
> 
> As of fairly recently, the Fedora package also uses pg_ctl for both
> starting and stopping.  We've fixed all the reasons that formerly
> existed to avoid use of pg_ctl, and it's a real PITA to try to
> implement the waiting logic at shell level.

Well, it's debatable whether an init script should actually do any
waiting.  I'm not saying that what you are doing is wrong, but it
depends on local policy and conventions.  I maintain some unrelated init
scripts in Debian and have gotten occasional hell from users for holding
up the boot process even a bit while waiting for the service to become
fully operational.  A restart of a failing PostgreSQL server can take
minutes; I don't want to think about how that would be received. :-/
Considering how much work people are putting into speeding up the boot
process in Linux distributions at the moment, with upstart, systemd
etc., it's not clear to me that the waiting feature is a required
behavior.




Re: Bug with pg_ctl -w/wait and config-only directories

From
Peter Eisentraut
Date:
On mån, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> Why were people not using pg_ctl?

Actually, a slight correction/addition here: The Debian init script does
use pg_ctl to start the service.  Seems to work fine.



Re: Bug with pg_ctl -w/wait and config-only directories

From
Robert Haas
Date:
On Tue, Oct 4, 2011 at 8:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I still think this is a matter for HEAD only.  We haven't supported
> these cases in back branches and so there is little argument for
> back-patching.

According to Bruce's original post, there is at least one 9.1
regression here relative to 9.0:

>> What is even worse is that pre-9.1, pg_ctl start would read ports from
>> the pg_ctl -o command line, but in 9.1 we changed this to force reading
>> the postmaster.pid file to find the port number and socket directory
>> location --- meaning, new in PG 9.1, 'pg_ctl -w start' doesn't work for
>> config-only directories either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> As of fairly recently, the Fedora package also uses pg_ctl for both
> >> starting and stopping.  We've fixed all the reasons that formerly
> >> existed to avoid use of pg_ctl, and it's a real PITA to try to
> >> implement the waiting logic at shell level.
> 
> > OK, that's a good use-case to warrant barrelling ahead with improving
> > pg_ctl for config-only installs.
> 
> Did I say that?  The Fedora packages don't support config-only
> arrangements (if you tried, SELinux would likely deny access to whatever
> directory you chose ...)
> 
> > What releases do we want to apply that
> > patch to allow postgres to dump config values and have pg_ctl use them? 
> > This patch is required for old/new installs for pg_upgrade to work.
> 
> I still think this is a matter for HEAD only.  We haven't supported
> these cases in back branches and so there is little argument for
> back-patching.

OK.  Just be aware that limits pg_upgrade to only upgrading old
config-only installs of PG 9.2+, e.g. old PG 9.2 to new PG 9.3.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> On m?n, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> > Why were people not using pg_ctl?
> 
> Actually, a slight correction/addition here: The Debian init script does
> use pg_ctl to start the service.  Seems to work fine.

Yes.  The script authors discovered a working behavior, which matches my
research:
       pg_ctl start            specify config directory       pg_ctl -w start         impossible       pg_ctl restart
      impossible       pg_ctl stop             specify real data dir       pg_ctl -w stop          specify real data
dir      pg_ctl reload           specify real data dir
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On tis, 2011-10-04 at 17:49 -0400, Tom Lane wrote:
>> As of fairly recently, the Fedora package also uses pg_ctl for both
>> starting and stopping.  We've fixed all the reasons that formerly
>> existed to avoid use of pg_ctl, and it's a real PITA to try to
>> implement the waiting logic at shell level.

> Well, it's debatable whether an init script should actually do any
> waiting.  I'm not saying that what you are doing is wrong, but it
> depends on local policy and conventions.  I maintain some unrelated init
> scripts in Debian and have gotten occasional hell from users for holding
> up the boot process even a bit while waiting for the service to become
> fully operational.  A restart of a failing PostgreSQL server can take
> minutes; I don't want to think about how that would be received. :-/
> Considering how much work people are putting into speeding up the boot
> process in Linux distributions at the moment, with upstart, systemd
> etc., it's not clear to me that the waiting feature is a required
> behavior.

Well, actually, it wasn't until Fedora went to systemd that I could
sanely use "pg_ctl start -w".  In SysV initscripts, you're right,
waiting indefinitely for the DB server to come up is not tenable.
But in systemd, there is no serialization of services and it's better
if systemd is aware that the service isn't fully started yet.
In particular, with this implementation, somebody can put "After:
postgresql.service" in their unit file and be sure that the DB will be
ready when their service starts.
        regards, tom lane


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Tue, Oct 4, 2011 at 8:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I still think this is a matter for HEAD only. ?We haven't supported
> > these cases in back branches and so there is little argument for
> > back-patching.
> 
> According to Bruce's original post, there is at least one 9.1
> regression here relative to 9.0:
> 
> >> What is even worse is that pre-9.1, pg_ctl start would read ports from
> >> the pg_ctl -o command line, but in 9.1 we changed this to force reading
> >> the postmaster.pid file to find the port number and socket directory
> >> location --- meaning, new in PG 9.1, 'pg_ctl -w start' doesn't work for
> >> config-only directories either.

Yes, PG 9.1 pg_ctl -w does this:
$ pg_ctl -w -D tmp startwaiting for server to start....LOG:  could not open usermap
file"/usr/var/local/pgdev/pgfoundry/pg_migrator/pg_migrator/tmp/pg_ident.conf":Nosuch file or directoryLOG:  database
systemwas shut down at 2011-10-05 10:53:09 EDTLOG:  database system is ready to accept connectionsLOG:  autovacuum
launcherstarted.... stopped waitingpg_ctl: could not start serverExamine the log output.
 

This used to work.  Of course, there are many non-standard setting that
didn't work in pre-9.1.

I think the real question is whether this issue or allowing pg_upgrade
to work with and old pre-9.2 is worth the backpatching risk.  I am
unsure myself.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Robert Haas wrote:
> > On Mon, Oct 3, 2011 at 11:04 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > OK, here is a patch that adds a -C option to the postmaster so any
> > > config variable can be dumped, even while the server is running (there
> > > is no security check because we don't have a user name at this point),

OK, here is an updated patch, with documentation, that I consider ready
for commit to git head.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml
new file mode 100644
index b16bbdf..3807f40
*** a/doc/src/sgml/ref/postgres-ref.sgml
--- b/doc/src/sgml/ref/postgres-ref.sgml
*************** PostgreSQL documentation
*** 141,146 ****
--- 141,160 ----
       </varlistentry>

       <varlistentry>
+       <term><option>-C <replaceable>name</replaceable></option></term>
+       <listitem>
+        <para>
+         Returns the value of a named run-time parameter, and exits.
+         (See the <option>-c</> option above for details.)  This can
+         be used on a running server, and returns values from
+         <filename>postgresql.conf</>, modified by any parameters
+         supplied in this invocation.  It does not reflect parameters
+         supplied when the cluster was started.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>-d <replaceable>debug-level</replaceable></option></term>
        <listitem>
         <para>
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
new file mode 100644
index 0a84d97..dd7493c
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** bool        enable_bonjour = false;
*** 203,208 ****
--- 203,210 ----
  char       *bonjour_name;
  bool        restart_after_crash = true;

+ char         *output_config_variable = NULL;
+
  /* PIDs of special child processes; 0 when not running */
  static pid_t StartupPID = 0,
              BgWriterPID = 0,
*************** PostmasterMain(int argc, char *argv[])
*** 537,543 ****
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
--- 539,545 ----
       * tcop/postgres.c (the option sets should not conflict) and with the
       * common help() function in main/main.c.
       */
!     while ((opt = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
      {
          switch (opt)
          {
*************** PostmasterMain(int argc, char *argv[])
*** 554,559 ****
--- 556,565 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 output_config_variable = optarg;
+                 break;
+
              case 'D':
                  userDoption = optarg;
                  break;
*************** PostmasterMain(int argc, char *argv[])
*** 728,733 ****
--- 734,746 ----
      if (!SelectConfigFiles(userDoption, progname))
          ExitPostmaster(2);

+     if (output_config_variable != NULL)
+     {
+         /* permission is handled because the user is reading inside the data dir */
+         puts(GetConfigOption(output_config_variable, false, false));
+         ExitPostmaster(0);
+     }
+
      /* Verify that DataDir looks reasonable */
      checkDataDir();

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
new file mode 100644
index c7eac71..19d94b2
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** process_postgres_switches(int argc, char
*** 3170,3176 ****
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
--- 3170,3176 ----
       * postmaster/postmaster.c (the option sets should not conflict) and with
       * the common help() function in main/main.c.
       */
!     while ((flag = getopt(argc, argv, "A:B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:v:W:-:")) != -1)
      {
          switch (flag)
          {
*************** process_postgres_switches(int argc, char
*** 3187,3192 ****
--- 3187,3196 ----
                  IsBinaryUpgrade = true;
                  break;

+             case 'C':
+                 /* ignored for consistency with the postmaster */
+                 break;
+
              case 'D':
                  if (secure)
                      userDoption = strdup(optarg);
*************** process_postgres_switches(int argc, char
*** 3272,3278 ****
                  break;

              case 'T':
!                 /* ignored for consistency with postmaster */
                  break;

              case 't':
--- 3276,3282 ----
                  break;

              case 'T':
!                 /* ignored for consistency with the postmaster */
                  break;

              case 't':
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbdfe7..e633d0c
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*************** static ShutdownMode shutdown_mode = SMAR
*** 81,86 ****
--- 81,87 ----
  static int    sig = SIGTERM;        /* default */
  static CtlCommand ctl_command = NO_COMMAND;
  static char *pg_data = NULL;
+ static char *pg_config = NULL;
  static char *pgdata_opt = NULL;
  static char *post_opts = NULL;
  static const char *progname;
*************** static void do_status(void);
*** 131,136 ****
--- 132,138 ----
  static void do_promote(void);
  static void do_kill(pgpid_t pid);
  static void print_msg(const char *msg);
+ static void adjust_data_dir(void);

  #if defined(WIN32) || defined(__CYGWIN__)
  static bool pgwin32_IsInstalled(SC_HANDLE);
*************** pgwin32_CommandLine(bool registration)
*** 1265,1274 ****
          strcat(cmdLine, "\"");
      }

!     if (pg_data)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_data);
          strcat(cmdLine, "\"");
      }

--- 1267,1276 ----
          strcat(cmdLine, "\"");
      }

!     if (pg_config)
      {
          strcat(cmdLine, " -D \"");
!         strcat(cmdLine, pg_config);
          strcat(cmdLine, "\"");
      }

*************** set_starttype(char *starttypeopt)
*** 1886,1891 ****
--- 1888,1946 ----
  }
  #endif

+ /*
+  * adjust_data_dir
+  *
+  * If a configuration-only directory was specified, find the real data dir.
+  */
+ void
+ adjust_data_dir(void)
+ {
+     char        cmd[MAXPGPATH], filename[MAXPGPATH], *my_exec_path;
+     FILE       *fd;
+
+     /* If there is no postgresql.conf, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/postgresql.conf", pg_config);
+     if ((fd = fopen(filename, "r")) == NULL)
+         return;
+     fclose(fd);
+
+     /* If PG_VERSION exists, it can't be a config-only dir */
+     snprintf(filename, sizeof(filename), "%s/PG_VERSION", pg_config);
+     if ((fd = fopen(filename, "r")) != NULL)
+     {
+         fclose(fd);
+         return;
+     }
+
+     /* Must be a configuration directory, so find the data directory */
+
+     /* we use a private my_exec_path to avoid interfering with later uses */
+     if (exec_path == NULL)
+         my_exec_path = find_other_exec_or_die(argv0, "postgres", PG_BACKEND_VERSIONSTR);
+     else
+         my_exec_path = xstrdup(exec_path);
+
+     snprintf(cmd, MAXPGPATH, SYSTEMQUOTE "\"%s\" %s%s -C data_directory" SYSTEMQUOTE,
+              my_exec_path, pgdata_opt ? pgdata_opt : "", post_opts ?
+              post_opts : "");
+
+     fd = popen(cmd, "r");
+     if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL)
+     {
+         write_stderr(_("%s: cannot find the data directory using %s\n"), progname, my_exec_path);
+         exit(1);
+     }
+     pclose(fd);
+     free(my_exec_path);
+
+     if (strlen(filename) > 0 && filename[strlen(filename) - 1] == '\n')
+         filename[strlen(filename) - 1] = '\0';
+     free(pg_data);
+     pg_data = xstrdup(filename);
+     canonicalize_path(pg_data);
+ }
+

  int
  main(int argc, char **argv)
*************** main(int argc, char **argv)
*** 2120,2133 ****
      }

      /* Note we put any -D switch into the env var above */
!     pg_data = getenv("PGDATA");
!     if (pg_data)
      {
!         pg_data = xstrdup(pg_data);
!         canonicalize_path(pg_data);
      }

!     if (pg_data == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),
--- 2175,2191 ----
      }

      /* Note we put any -D switch into the env var above */
!     pg_config = getenv("PGDATA");
!     if (pg_config)
      {
!         pg_config = xstrdup(pg_config);
!         canonicalize_path(pg_config);
!         pg_data = xstrdup(pg_config);
      }

!     adjust_data_dir();
!
!     if (pg_config == NULL &&
          ctl_command != KILL_COMMAND && ctl_command != UNREGISTER_COMMAND)
      {
          write_stderr(_("%s: no database directory specified and environment variable PGDATA unset\n"),

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > Robert Haas wrote:
> > > On Mon, Oct 3, 2011 at 11:04 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > > OK, here is a patch that adds a -C option to the postmaster so any
> > > > config variable can be dumped, even while the server is running (there
> > > > is no security check because we don't have a user name at this point),
>
> OK, here is an updated patch, with documentation, that I consider ready
> for commit to git head.

And with a --help documentation patch.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
commit c599ef7e91d4a28fe59b7ab4d84666bb01376570
Author: Bruce Momjian <bruce@momjian.us>
Date:   Wed Oct 5 11:18:26 2011 -0400

    Add --help output.

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
new file mode 100644
index 8d9cb94..b14c731
*** a/src/backend/main/main.c
--- b/src/backend/main/main.c
*************** help(const char *progname)
*** 283,288 ****
--- 283,289 ----
  #endif
      printf(_("  -B NBUFFERS     number of shared buffers\n"));
      printf(_("  -c NAME=VALUE   set run-time parameter\n"));
+     printf(_("  -C NAME         return run-time parameter\n"));
      printf(_("  -d 1-5          debugging level\n"));
      printf(_("  -D DATADIR      database directory\n"));
      printf(_("  -e              use European date input format (DMY)\n"));

Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Wed, Oct 05, 2011 at 10:44:38AM -0400, Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > On m?n, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> > > Why were people not using pg_ctl?
> >
> > Actually, a slight correction/addition here: The Debian init script does
> > use pg_ctl to start the service.  Seems to work fine.
>
> Yes.  The script authors discovered a working behavior, which matches my
> research:
>
>         pg_ctl start            specify config directory
>         pg_ctl -w start         impossible
> ...


Maybe I'm misunderstanding what you've written, but for 'pg_ctl -w' it is
possible. The following command does work (I've replaced the variables
with their default values to make it easier to read):
 su -l postgres \   -c "env PGPORT=\"5432\" /usr/lib/postgresql-9.1/bin/pg_ctl start -w \       -t 60 -s -D
/var/lib/postgresql/9.1/data/\        -o '-D /etc/postgresql-9.1/ \
--data-directory=/var/lib/postgresql/9.1/data/\           --silent-mode=true'" 

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Mr. Aaron W. Swenson wrote:
-- Start of PGP signed section.
> On Wed, Oct 05, 2011 at 10:44:38AM -0400, Bruce Momjian wrote:
> > Peter Eisentraut wrote:
> > > On m?n, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> > > > Why were people not using pg_ctl?
> > > 
> > > Actually, a slight correction/addition here: The Debian init script does
> > > use pg_ctl to start the service.  Seems to work fine.
> > 
> > Yes.  The script authors discovered a working behavior, which matches my
> > research:
> > 
> >         pg_ctl start            specify config directory
> >         pg_ctl -w start         impossible
> > ...
> 
> 
> Maybe I'm misunderstanding what you've written, but for 'pg_ctl -w' it is
> possible. The following command does work (I've replaced the variables
> with their default values to make it easier to read):
> 
>   su -l postgres \
>     -c "env PGPORT=\"5432\" /usr/lib/postgresql-9.1/bin/pg_ctl start -w \
>         -t 60 -s -D /var/lib/postgresql/9.1/data/ \
>          -o '-D /etc/postgresql-9.1/ \
>             --data-directory=/var/lib/postgresql/9.1/data/ \
>             --silent-mode=true'"

Wow.  So you are telling pg_ctl to use the real data directory, but are
passing flags to the postmaster to start with the config directory and
also setting data-directory to the real data directory.  I am impressed.

I can see how that would work.  Did you look at the C code to figure
this out or did you just test until it worked, or did it just seem
logical to you?

With the patch I am going to commit, you will not need to use one of the
-D flags because pg_ctl will find the data directory location;  you will
just specify the config-only directory with one -D, and the
--data-directory.

pg_upgrade could use this method if it had -o options for old and new
clusters.  Right now it has -p and -P and those could be removed and use
-o/-O instead, but it would require parsing the -o string, with is
problematic.  The right solution for the port number would be to use the
new postmaster -C option, pass the -o into that, and read the port
number with 'postmaster -C port'.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Wed, Oct 05, 2011 at 07:20:16PM -0400, Bruce Momjian wrote:
> Mr. Aaron W. Swenson wrote:
> -- Start of PGP signed section.
> > On Wed, Oct 05, 2011 at 10:44:38AM -0400, Bruce Momjian wrote:
> > > Peter Eisentraut wrote:
> > > > On m?n, 2011-10-03 at 15:09 -0400, Bruce Momjian wrote:
> > > > > Why were people not using pg_ctl?
> > > >
> > > > Actually, a slight correction/addition here: The Debian init script does
> > > > use pg_ctl to start the service.  Seems to work fine.
> > >
> > > Yes.  The script authors discovered a working behavior, which matches my
> > > research:
> > >
> > >         pg_ctl start            specify config directory
> > >         pg_ctl -w start         impossible
> > > ...
> >
> >
> > Maybe I'm misunderstanding what you've written, but for 'pg_ctl -w' it is
> > possible. The following command does work (I've replaced the variables
> > with their default values to make it easier to read):
> >
> >   su -l postgres \
> >     -c "env PGPORT=\"5432\" /usr/lib/postgresql-9.1/bin/pg_ctl start -w \
> >         -t 60 -s -D /var/lib/postgresql/9.1/data/ \
> >          -o '-D /etc/postgresql-9.1/ \
> >             --data-directory=/var/lib/postgresql/9.1/data/ \
> >             --silent-mode=true'"
>
> Wow.  So you are telling pg_ctl to use the real data directory, but are
> passing flags to the postmaster to start with the config directory and
> also setting data-directory to the real data directory.  I am impressed.
>
> I can see how that would work.  Did you look at the C code to figure
> this out or did you just test until it worked, or did it just seem
> logical to you?

No, I didn't look into the C code. (Though, I have poked around in there a
bit.)

Being a Gentoo user has taught me a few things like one should always
'read the friendly manual.' It just seemed logical after reading the
pg_ctl documentation that passing options to the postmaster would do the
trick, which spurred my testing several variations until it worked. But, I
only knew that I could give 'postgres'/'postmaster' the configuration
directory and it would do the right thing.
> With the patch I am going to commit, you will not need to use one of the
> -D flags because pg_ctl will find the data directory location;  you will
> just specify the config-only directory with one -D, and the
> --data-directory.

So, you're saying that my passing the config directory to pg_ctl through
'-D' will eliminate the need -- as far as config and data directories are
concerned -- to pass options directly to the postmaster?

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Mr. Aaron W. Swenson wrote:
> > With the patch I am going to commit, you will not need to use one of the
> > -D flags because pg_ctl will find the data directory location;  you will
> > just specify the config-only directory with one -D, and the
> > --data-directory.
> 
> So, you're saying that my passing the config directory to pg_ctl through
> '-D' will eliminate the need -- as far as config and data directories are
> concerned -- to pass options directly to the postmaster?

Yes, in PG 9.2.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Bug with pg_ctl -w/wait and config-only directories

From
"Mr. Aaron W. Swenson"
Date:
On Wed, Oct 05, 2011 at 07:59:07PM -0400, Bruce Momjian wrote:
> Mr. Aaron W. Swenson wrote:
> > > With the patch I am going to commit, you will not need to use one of the
> > > -D flags because pg_ctl will find the data directory location;  you will
> > > just specify the config-only directory with one -D, and the
> > > --data-directory.
> >
> > So, you're saying that my passing the config directory to pg_ctl through
> > '-D' will eliminate the need -- as far as config and data directories are
> > concerned -- to pass options directly to the postmaster?
>
> Yes, in PG 9.2.

\o/

--
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

Re: Bug with pg_ctl -w/wait and config-only directories

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > Bruce Momjian wrote:
> > > Robert Haas wrote:
> > > > On Mon, Oct 3, 2011 at 11:04 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > > > OK, here is a patch that adds a -C option to the postmaster so any
> > > > > config variable can be dumped, even while the server is running (there
> > > > > is no security check because we don't have a user name at this point),
> > 
> > OK, here is an updated patch, with documentation, that I consider ready
> > for commit to git head.
> 
> And with a --help documentation patch.

Patch applied.  This competes this TODO item:
Allow pg_ctl to work properly with configuration files located outsidethe PGDATA directory 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +