Thread: Unhelpful initdb error message

Unhelpful initdb error message

From
Thom Brown
Date:
Hi all,

After building Postgres and trying an initdb, I'm getting the following:


thom@swift:~/Development$ initdb
The files belonging to this database system will be owned by user "thom".
This user must also own the server process.

The database cluster will be initialized with locale en_GB.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".

fixing permissions on existing directory /home/thom/Development/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 400kB
creating configuration files ... ok
creating template1 database in /home/thom/Development/data/base/1 ...
FATAL:  could not remove old lock file "postmaster.pid": No such file or directory
HINT:  The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
child process exited with exit code 1
initdb: removing contents of data directory "/home/thom/Development/data"


It can't remove an old lock file due to it not existing, but the hint
says it was left over but couldn't be removed.  The hint contradicts
the error message.  There is nothing in the data directory at all
before trying this, and nothing after.  Repeating initdb yields the
same result.

But, if I rename the data directory to something else and mkdir data
again, all is well.  I can make it break again by removing the new
data directory and renaming the old one back to data, still completely
empty.  Note that throughout all of this, Postgres is running, but as
a separate user and using completely separate directories, since it's
the standard packaged version on Debian.

Can anyone suggest what is wrong here?

--
Thom

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> thom@swift:~/Development$ initdb
> The files belonging to this database system will be owned by user "thom".
> This user must also own the server process.

> The database cluster will be initialized with locale en_GB.UTF-8.
> The default database encoding has accordingly been set to UTF8.
> The default text search configuration will be set to "english".

> fixing permissions on existing directory /home/thom/Development/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 10
> selecting default shared_buffers ... 400kB
> creating configuration files ... ok
> creating template1 database in /home/thom/Development/data/base/1 ...
> FATAL:  could not remove old lock file "postmaster.pid": No such file
> or directory
> HINT:  The file seems accidentally left over, but it could not be
> removed. Please remove the file by hand and try again.
> child process exited with exit code 1
> initdb: removing contents of data directory "/home/thom/Development/data"

Um ... I assume this is some patched version rather than pristine
sources?  It's pretty hard to explain why it's falling over like that.

I don't think there is anything wrong with the error message, because
it's intended for the case where some previous postmaster failed and
left a lock file behind.  The question is how is it you're getting to
that error, not whether we should change its text.

One possible lead is that it looks like the postmaster-starting probes
to select max_connections and shared_buffers all failed too, since those
numbers came out as the minimums.

            regards, tom lane

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 7:46:37 am Thom Brown wrote:
> Hi all,
>
> After building Postgres and trying an initdb, I'm getting the following:
>
>
> thom@swift:~/Development$ initdb
> The files belonging to this database system will be owned by user "thom".
> This user must also own the server process.
>
> The database cluster will be initialized with locale en_GB.UTF-8.
> The default database encoding has accordingly been set to UTF8.
> The default text search configuration will be set to "english".
>
> fixing permissions on existing directory /home/thom/Development/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 10
> selecting default shared_buffers ... 400kB
> creating configuration files ... ok
> creating template1 database in /home/thom/Development/data/base/1 ...
> FATAL:  could not remove old lock file "postmaster.pid": No such file
> or directory
> HINT:  The file seems accidentally left over, but it could not be
> removed. Please remove the file by hand and try again.
> child process exited with exit code 1
> initdb: removing contents of data directory "/home/thom/Development/data"
>
>
> It can't remove an old lock file due to it not existing, but the hint
> says it was left over but couldn't be removed.  The hint contradicts
> the error message.  There is nothing in the data directory at all
> before trying this, and nothing after.  Repeating initdb yields the
> same result.
>
> But, if I rename the data directory to something else and mkdir data
> again, all is well.  I can make it break again by removing the new
> data directory and renaming the old one back to data, still completely
> empty.  Note that throughout all of this, Postgres is running, but as
> a separate user and using completely separate directories, since it's
> the standard packaged version on Debian.
>
> Can anyone suggest what is wrong here?

The postmaster.pid is located outside the data directory, but points back to the
data directory.  Not sure where Debian puts it, though at a guess somewhere in /var.
Anyway, search for postmaster.pid.


--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> thom@swift:~/Development$ initdb
>> The files belonging to this database system will be owned by user "thom".
>> This user must also own the server process.
>
>> The database cluster will be initialized with locale en_GB.UTF-8.
>> The default database encoding has accordingly been set to UTF8.
>> The default text search configuration will be set to "english".
>
>> fixing permissions on existing directory /home/thom/Development/data ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 10
>> selecting default shared_buffers ... 400kB
>> creating configuration files ... ok
>> creating template1 database in /home/thom/Development/data/base/1 ...
>> FATAL:  could not remove old lock file "postmaster.pid": No such file
>> or directory
>> HINT:  The file seems accidentally left over, but it could not be
>> removed. Please remove the file by hand and try again.
>> child process exited with exit code 1
>> initdb: removing contents of data directory "/home/thom/Development/data"
>
> Um ... I assume this is some patched version rather than pristine
> sources?  It's pretty hard to explain why it's falling over like that.

No, I did a "git stash", "git clean -f" and "git pull" before trying to build.

--
Thom

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> The postmaster.pid is located outside the data directory, but points back to the
> data directory.  Not sure where Debian puts it, though at a guess somewhere in /var.
> Anyway, search for postmaster.pid.

I'm not sure, because if I use a new data directory, initdb it and
start the service, the postmaster.pid appears in it, and not as a
symbolic link.

I did a search for postmaster.pid in the whole of /var and it only
shows up "/var/lib/postgresql/9.1/main/postmaster.pid"

--
Thom

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 8:11:20 am Thom Brown wrote:
> On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> > The postmaster.pid is located outside the data directory, but points back
> > to the data directory.  Not sure where Debian puts it, though at a guess
> > somewhere in /var. Anyway, search for postmaster.pid.
>
> I'm not sure, because if I use a new data directory, initdb it and
> start the service, the postmaster.pid appears in it, and not as a
> symbolic link.
>
> I did a search for postmaster.pid in the whole of /var and it only
> shows up "/var/lib/postgresql/9.1/main/postmaster.pid"


My guess is if you open that file you will find it points back to the old
directory.  So are you still running the Debian packaged version of Postgres?
Or in other words does a ps show any other postmasters running other than the
new one you built?

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:11, Thom Brown <thom@linux.com> wrote:
> On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>> The postmaster.pid is located outside the data directory, but points back to the
>> data directory.  Not sure where Debian puts it, though at a guess somewhere in /var.
>> Anyway, search for postmaster.pid.
>
> I'm not sure, because if I use a new data directory, initdb it and
> start the service, the postmaster.pid appears in it, and not as a
> symbolic link.
>
> I did a search for postmaster.pid in the whole of /var and it only
> shows up "/var/lib/postgresql/9.1/main/postmaster.pid"

Correction, this is Ubuntu, not Debian.  11.10 if it's of any consequence.

The file system is ext4 with
rw,noatime,nodiratime,errors=remount-ro,commit=0 on a Crucial m4 SSD.

ecryptfs is in use in the parent directory.

--
Thom

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:18, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Tuesday, March 06, 2012 8:11:20 am Thom Brown wrote:
>> On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>> > The postmaster.pid is located outside the data directory, but points back
>> > to the data directory.  Not sure where Debian puts it, though at a guess
>> > somewhere in /var. Anyway, search for postmaster.pid.
>>
>> I'm not sure, because if I use a new data directory, initdb it and
>> start the service, the postmaster.pid appears in it, and not as a
>> symbolic link.
>>
>> I did a search for postmaster.pid in the whole of /var and it only
>> shows up "/var/lib/postgresql/9.1/main/postmaster.pid"
>
>
> My guess is if you open that file you will find it points back to the old
> directory.  So are you still running the Debian packaged version of Postgres?
> Or in other words does a ps show any other postmasters running other than the
> new one you built?

No, only the ones running as the postgres user.

Here's the contents of the pid file in /var/lib/postgresql/9.1/main/

1199
/var/lib/postgresql/9.1/main
1330883367
5432
/var/run/postgresql
localhost
  5432001         0

And if I start my development copy, this is the content of its postmaster.pid:

27061
/home/thom/Development/data
1331050950
5488
/tmp
localhost
  5488001 191365126

--
Thom
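
For reference, the seven lines in the two postmaster.pid listings above follow the layout documented for PostgreSQL 9.1: PID, data directory, postmaster start time (epoch seconds), port, Unix socket directory, first listen address, and the shared memory key and ID. The Python sketch below only illustrates that layout. The function and field names are made up for this example, and other PostgreSQL versions may lay the file out differently.

```python
from typing import NamedTuple


class PidFileInfo(NamedTuple):
    """Illustrative container; the field names are labels chosen for this sketch."""
    pid: int
    data_dir: str
    start_time: int
    port: int
    socket_dir: str
    listen_addr: str
    shmem_key: int
    shmem_id: int


def parse_postmaster_pid(text: str) -> PidFileInfo:
    """Parse the text of a 9.1-style postmaster.pid (assumed seven-line layout)."""
    lines = text.splitlines()
    if len(lines) < 7:
        raise ValueError("expected at least 7 lines, got %d" % len(lines))
    # The last line holds two whitespace-separated integers.
    key, shmid = lines[6].split()
    return PidFileInfo(
        pid=int(lines[0]),
        data_dir=lines[1].strip(),
        start_time=int(lines[2]),
        port=int(lines[3]),
        socket_dir=lines[4].strip(),
        listen_addr=lines[5].strip(),
        shmem_key=int(key),
        shmem_id=int(shmid),
    )


# Contents of the development copy's postmaster.pid as posted in the thread.
sample = """27061
/home/thom/Development/data
1331050950
5488
/tmp
localhost
  5488001 191365126
"""
info = parse_postmaster_pid(sample)
```

Read this way, the development copy's file names PID 27061, port 5488 and socket directory /tmp, which matches the socket files later seen in /tmp.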

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 6 March 2012 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Um ... I assume this is some patched version rather than pristine
>> sources?  It's pretty hard to explain why it's falling over like that.

> No, I did a "git stash", "git clean -f" and "git pull" before trying to build.

[ scratches head... ]  I can't reproduce it with current git tip.

            regards, tom lane

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> On 6 March 2012 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Um ... I assume this is some patched version rather than pristine
>>> sources?  It's pretty hard to explain why it's falling over like that.
>
>> No, I did a "git stash", "git clean -f" and "git pull" before trying to build.
>
> [ scratches head... ]  I can't reproduce it with current git tip.

And I don't think I can reproduce this if I remove that directory.
I've seen this issue about 3 or 4 times in the past, and fixed it by
ditching the old data dir completely.  I'm just not sure what causes
this to happen.

Looking back through my terminal log, one thing might lend a clue from
before I tried rebuilding it:

thom@swift:~/Development$ pg_ctl stop
waiting for server to shut down....cd .postgre.s
.............
....



....^C
thom@swift:~/Development$ pg_ctl stop
pg_ctl: could not send stop signal (PID: 2807): No such process
thom@swift:~/Development$ ps -ef | grep postgres
postgres  1199     1  0 Mar04 ?        00:00:01 /usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main -c config_file=/etc/postgresql/9.1/main/postgresql.conf
postgres  1273  1199  0 Mar04 ?        00:00:18 postgres: writer process
postgres  1274  1199  0 Mar04 ?        00:00:14 postgres: wal writer process
postgres  1275  1199  0 Mar04 ?        00:00:03 postgres: autovacuum launcher process
postgres  1276  1199  0 Mar04 ?        00:00:02 postgres: stats collector process
thom     16476  4302  0 15:30 pts/1    00:00:00 grep --color=auto postgres


Postgres wouldn't shut down.  I had no other terminal windows using
psql, no other database client apps open, yet it stayed shutting down,
so I CTRL+C'd it and tried again.  A quick check of running processes
showed that it had stopped running. (it shows postgres running above,
but the dev copy runs as my user, not postgres)

--
Thom

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 8:24:20 am Thom Brown wrote:
>
>
> No, only the ones running as the postgres user.

In my original read, I missed the part where you had the Ubuntu/Debian packaged
version running.

>
> Here's the contents of the pid file in /var/lib/postgresql/9.1/main/
>
> 1199
> /var/lib/postgresql/9.1/main
> 1330883367
> 5432
> /var/run/postgresql
> localhost
>   5432001         0
>
> And if I start my development copy, this is the content of its
> postmaster.pid:
>
> 27061
> /home/thom/Development/data
> 1331050950
> 5488
> /tmp
> localhost
>   5488001 191365126

So how are you getting the file above? I thought initdb refused to init the directory
and that you could not find the pid file it was referring to? Just on a hunch, what is
in /tmp?

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 16:40, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Tuesday, March 06, 2012 8:24:20 am Thom Brown wrote:
>>
>>
>> No, only the ones running as the postgres user.
>
> In my original read, I missed the part where you had the Ubuntu/Debian packaged
> version running.
>
>>
>> Here's the contents of the pid file in /var/lib/postgresql/9.1/main/
>>
>> 1199
>> /var/lib/postgresql/9.1/main
>> 1330883367
>> 5432
>> /var/run/postgresql
>> localhost
>>   5432001         0
>>
>> And if I start my development copy, this is the content of its
>> postmaster.pid:
>>
>> 27061
>> /home/thom/Development/data
>> 1331050950
>> 5488
>> /tmp
>> localhost
>>   5488001 191365126
>
> So how are you getting the file above? I thought initdb refused to init the directory
> and that you could not find the pid file it was referring to? Just on a hunch, what is
> in /tmp?

I got the above output when I created a new data directory and initdb'd it.

/tmp shows:

     4 -rw-------  1 thom    thom           55 2012-03-06 16:22 .s.PGSQL.5488.lock
     0 srwxrwxrwx  1 thom    thom            0 2012-03-06 16:22 .s.PGSQL.5488

Once it's up and running.  These disappear after though.  When using
the old data directory again, there's no evidence of anything like
this in /tmp.

--
Thom

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 8:44:10 am Thom Brown wrote:

> >> And if I start my development copy, this is the content of its
> >> postmaster.pid:
> >>
> >> 27061
> >> /home/thom/Development/data
> >> 1331050950
> >> 5488
> >> /tmp
> >> localhost
> >>   5488001 191365126
> >
> > So how are you getting the file above? I thought initdb refused to init the
> > directory and that you could not find the pid file it was referring to? Just
> > on a hunch, what is in /tmp?
>
> I got the above output when I created a new data directory and initdb'd it.

Still not understanding. In your original post you said
/home/thom/Development/data was the original directory you could not initdb. How
could it also be the new directory you can initdb as indicated by the
postmaster.pid?


From your previous post:
  thom@swift:~/Development$ pg_ctl stop
  pg_ctl: could not send stop signal (PID: 2807): No such process

Doing the above without qualifying which version of pg_ctl you are using or which
data directory you are pointing at is dangerous.  The combination of implied
pathing and preset env variables could lead to all sorts of mischief.


>
> /tmp shows:
>
>      4 -rw-------  1 thom    thom           55 2012-03-06 16:22
> .s.PGSQL.5488.lock
>      0 srwxrwxrwx  1 thom    thom            0 2012-03-06 16:22
> .s.PGSQL.5488
>
> Once it's up and running.  These disappear after though.  When using
> the old data directory again, there's no evidence of anything like
> this in /tmp.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 17:00, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Tuesday, March 06, 2012 8:44:10 am Thom Brown wrote:
>
>> >> And if I start my development copy, this is the content of its
>> >> postmaster.pid:
>> >>
>> >> 27061
>> >> /home/thom/Development/data
>> >> 1331050950
>> >> 5488
>> >> /tmp
>> >> localhost
>> >>   5488001 191365126
>> >
>> > So how are you getting the file above? I thought initdb refused to init the
>> > directory and that you could not find the pid file it was referring to? Just
>> > on a hunch, what is in /tmp?
>>
>> I got the above output when I created a new data directory and initdb'd it.
>
> Still not understanding. In your original post you said
> /home/thom/Development/data was the original directory you could not initdb. How
> could it also be the new directory you can initdb as indicated by the
> postmaster.pid?

/home/thom/Development/data was causing problems so:

mv data databroken
mkdir data
initdb

... working fine again.  The postmaster.pid I posted came from this one after it
started up.  But if I do:

pg_ctl stop
rm -rf data
mv databroken data
initdb

... error messages appear again.

> From your previous post:
>  thom@swift:~/Development$ pg_ctl stop
>  pg_ctl: could not send stop signal (PID: 2807): No such process
>
> Doing the above without qualifying which version of pg_ctl you are using or which
> data directory you are pointing at is dangerous.  The combination of implied
> pathing and preset env variables could lead to all sorts of mischief.

Unlikely since pg_ctl isn't available in my search path once I remove
my local development bin dir from it.  None of the non-client tools for the
packaged version are available to normal users.  Those are all in
/usr/lib/postgresql/9.1/bin.  The only ones exposed to the search path
through symbolic links are:

clusterdb
createdb
createlang
createuser
dropdb
droplang
dropuser
pg_dump
pg_dumpall
pg_restore
psql
reindexdb
vacuumdb
vacuumlo

--
Thom

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> Looking back through my terminal log, one thing might lend a clue from
> before I tried rebuilding it:

> thom@swift:~/Development$ pg_ctl stop
> waiting for server to shut down....cd .postgre.s
> .............
> ....



> ....^C
> thom@swift:~/Development$ pg_ctl stop
> pg_ctl: could not send stop signal (PID: 2807): No such process
> thom@swift:~/Development$ ps -ef | grep postgres
> postgres  1199     1  0 Mar04 ?        00:00:01
> /usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main
> -c config_file=/etc/postgresql/9.1/main/postgresql.conf
> postgres  1273  1199  0 Mar04 ?        00:00:18 postgres: writer
> process
> postgres  1274  1199  0 Mar04 ?        00:00:14 postgres: wal writer
> process
> postgres  1275  1199  0 Mar04 ?        00:00:03 postgres: autovacuum
> launcher process
> postgres  1276  1199  0 Mar04 ?        00:00:02 postgres: stats
> collector process
> thom     16476  4302  0 15:30 pts/1    00:00:00 grep --color=auto postgres

Hm.  It looks like pg_ctl found a PID file pointing to a non-existent
process, which is a bit like what you're seeing initdb do.

I wonder whether this is somehow caused by conflicting settings for
PGDATA.  Do you have a setting for that in your environment, or .bashrc
or someplace, that is different from what you're trying to use?

            regards, tom lane
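
The "No such process" result that pg_ctl reports above is what the operating system returns when a signal is sent to a PID that no longer exists. Below is a minimal Python sketch of the same kind of liveness probe, assuming a POSIX system. `pid_is_alive` is a hypothetical helper written for this illustration, not something PostgreSQL ships.

```python
import os
import subprocess
import sys


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering a signal
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but belongs to another user
    return True


# Our own process is certainly running.
own_alive = pid_is_alive(os.getpid())

# A child that has exited and been reaped should no longer be found.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
child_alive = pid_is_alive(child.pid)
```

A PID that has been recycled by an unrelated process would still pass this check, so it can only show that a lock file is stale, not confirm that the process found is a postmaster.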

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 9:09:41 am Thom Brown wrote:
> On 6 March 2012 17:00, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> > On Tuesday, March 06, 2012 8:44:10 am Thom Brown wrote:
> >> >> And if I start my development copy, this is the content of its
> >> >> postmaster.pid:
> >> >>
> >> >> 27061
> >> >> /home/thom/Development/data
> >> >> 1331050950
> >> >> 5488
> >> >> /tmp
> >> >> localhost
> >> >>   5488001 191365126
> >> >
> >> > So how are you getting the file above? I thought initdb refused to init
> >> > the directory and that you could not find the pid file it was referring
> >> > to? Just on a hunch, what is in /tmp?
> >>
> >> I got the above output when I created a new data directory and initdb'd
> >> it.
> >
> > Still not understanding. In your original post you said
> > /home/thom/Development/data was the original directory you could not
> > initdb. How could it also be the new directory you can initdb as
> > indicated by the postmaster.pid?
>
> /home/thom/Development/data was causing problems so:
>
> mv data databroken
> mkdir data
> initdb
>
> ... working fine again.  The postmaster.pid I posted came from this one after it
> started up.  But if I do:
>
> pg_ctl stop
> rm -rf data
> mv databroken data
> initdb
>
> ... error messages appear again.


Humph, need more coffee.

>
> > From your previous post:
> >  thom@swift:~/Development$ pg_ctl stop
> >  pg_ctl: could not send stop signal (PID: 2807): No such process
> >
> > Doing the above without qualifying which version of pg_ctl you are using
> > or which data directory you are pointing at is dangerous.  The combination
> > of implied pathing and preset env variables could lead to all sorts of
> > mischief.
>
> Unlikely since pg_ctl isn't available in my search path once I remove
> my local development bin dir from it.  None of the non-client tools for the
> packaged version are available to normal users.  Those are all in
> /usr/lib/postgresql/9.1/bin.  The only ones exposed to the search path
> through symbolic links are:

env variables?

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> Looking back through my terminal log, one thing might lend a clue from
>> before I tried rebuilding it:
>
>> thom@swift:~/Development$ pg_ctl stop
>> waiting for server to shut down....cd .postgre.s
>> .............
>> ....
>
>
>
>> ....^C
>> thom@swift:~/Development$ pg_ctl stop
>> pg_ctl: could not send stop signal (PID: 2807): No such process
>> thom@swift:~/Development$ ps -ef | grep postgres
>> postgres  1199     1  0 Mar04 ?        00:00:01
>> /usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main
>> -c config_file=/etc/postgresql/9.1/main/postgresql.conf
>> postgres  1273  1199  0 Mar04 ?        00:00:18 postgres: writer
>> process
>> postgres  1274  1199  0 Mar04 ?        00:00:14 postgres: wal writer
>> process
>> postgres  1275  1199  0 Mar04 ?        00:00:03 postgres: autovacuum
>> launcher process
>> postgres  1276  1199  0 Mar04 ?        00:00:02 postgres: stats
>> collector process
>> thom     16476  4302  0 15:30 pts/1    00:00:00 grep --color=auto postgres
>
> Hm.  It looks like pg_ctl found a PID file pointing to a non-existent
> process, which is a bit like what you're seeing initdb do.
>
> I wonder whether this is somehow caused by conflicting settings for
> PGDATA.  Do you have a setting for that in your environment, or .bashrc
> or someplace, that is different from what you're trying to use?

These are in my env output:


PATH=/home/thom/Development/psql/bin/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
PGDATA=/home/thom/Development/data/
PGPORT=5488

This appears in my build script before configure:

export PGDATA=$HOME/Development/data/
export PATH=$HOME/Development/psql/bin/:$PATH
export PGPORT=5488

And those 3 lines also appear in my .bashrc file without any variation:

export PGDATA=$HOME/Development/data/
export PATH=$HOME/Development/psql/bin/:$PATH
export PGPORT=5488

--
Thom
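
One way to answer the "which pg_ctl, which data directory" question in a single step is to print the relevant environment variables next to the binaries that PATH actually resolves to. The Python sketch below is an illustration under stated assumptions: `resolve_pg_tools` is a hypothetical helper, and the demonstration uses a throwaway directory with a dummy `initdb` script rather than a real installation.

```python
import os
import shutil
import tempfile


def resolve_pg_tools(env, tools=("initdb", "pg_ctl", "postgres")):
    """Report PGDATA/PGPORT and the first match for each tool on env's PATH."""
    search_path = env.get("PATH", "")
    return {
        "PGDATA": env.get("PGDATA"),
        "PGPORT": env.get("PGPORT"),
        "tools": {name: shutil.which(name, path=search_path) for name in tools},
    }


# Demonstration with a throwaway directory standing in for a development bin dir.
with tempfile.TemporaryDirectory() as bin_dir:
    fake_initdb = os.path.join(bin_dir, "initdb")
    with open(fake_initdb, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(fake_initdb, 0o755)  # must be executable to be found on PATH

    demo_env = {
        "PATH": bin_dir,
        "PGDATA": "/home/thom/Development/data/",
        "PGPORT": "5488",
    }
    report = resolve_pg_tools(demo_env)
    found_initdb = report["tools"]["initdb"] == fake_initdb
    found_pg_ctl = report["tools"]["pg_ctl"]
```

Against a live shell the call would be `resolve_pg_tools(os.environ)`. A tool reported as `None` is not reachable through PATH at all, which is the situation described for the packaged server binaries.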

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@gmail.com> writes:
> The postmaster.pid is located outside the data directory, but points back to the
> data directory.  Not sure where Debian puts it, though at a guess somewhere in /var.
> Anyway, search for postmaster.pid.

Really?  That seems like an extremely dangerous/stupid/unnecessary hack
on the part of the Debian packagers.  What's keeping users from
accidentally starting two postmasters in the same data directory, if
they can put their pidfiles in (different) other places?

(This seems unrelated to Thom's issue, but it's still worrisome.)

            regards, tom lane

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 9:25:17 am Thom Brown wrote:

>
> These are in my env output:
>
> PATH=/home/thom/Development/psql/bin/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
> PGDATA=/home/thom/Development/data/
> PGPORT=5488
>
> This appears in my build script before configure:
>
> export PGDATA=$HOME/Development/data/
> export PATH=$HOME/Development/psql/bin/:$PATH
> export PGPORT=5488
>
> And those 3 lines also appear in my .bashrc file without any variation:
>
> export PGDATA=$HOME/Development/data/
> export PATH=$HOME/Development/psql/bin/:$PATH
> export PGPORT=5488

And you are sure there is no pg_ctl or initdb outside
/usr/lib/postgresql/9.1/bin or /home/thom/Development/psql/bin and in your PATH?

Just for grins what happens if you try an initdb using an explicit reference to
the binary /home/thom/Development/psql/bin/initdb and the -D
/home/thom/Development/data/ ?

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 6 March 2012 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> [ scratches head... ]  I can't reproduce it with current git tip.

> And I don't think I can reproduce this if I remove that directory.
> I've seen this issue about 3 or 4 times in the past, and fixed it by
> ditching the old data dir completely.  I'm just not sure what causes
> this to happen.

I'm a bit confused here.  Isn't the data directory totally empty before
initdb starts?  It's supposed to refuse to proceed otherwise.

            regards, tom lane

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 9:43:00 am Tom Lane wrote:
> Adrian Klaver <adrian.klaver@gmail.com> writes:
> > The postmaster.pid is located outside the data directory, but points back
> > to the data directory.  Not sure where Debian puts it, though at a guess
> > somewhere in /var. Anyway, search for postmaster.pid.
>
> Really?  That seems like an extremely dangerous/stupid/unnecessary hack
> on the part of the Debian packagers.  What's keeping users from
> accidentally starting two postmasters in the same data directory, if
> they can put their pidfiles in (different) other places?

No, that was a mistake on my part. It is in the $DATA directory.

>
> (This seems unrelated to Thom's issue, but it's still worrisome.)
>
>             regards, tom lane

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 17:45, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Tuesday, March 06, 2012 9:25:17 am Thom Brown wrote:
>
>>
>> These are in my env output:
>>
>> PATH=/home/thom/Development/psql/bin/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>> PGDATA=/home/thom/Development/data/
>> PGPORT=5488
>>
>> This appears in my build script before configure:
>>
>> export PGDATA=$HOME/Development/data/
>> export PATH=$HOME/Development/psql/bin/:$PATH
>> export PGPORT=5488
>>
>> And those 3 lines also appear in my .bashrc file without any variation:
>>
>> export PGDATA=$HOME/Development/data/
>> export PATH=$HOME/Development/psql/bin/:$PATH
>> export PGPORT=5488
>
> And you are sure there is no pg_ctl or initdb outside
> /usr/lib/postgresql/9.1/bin or /home/thom/Development/psql/bin and in your PATH?
>
> Just for grins what happens if you try an initdb using an explicit reference to
> the binary /home/thom/Development/psql/bin/initdb and the -D
> /home/thom/Development/data/ ?

thom@swift:~/Development$ /home/thom/Development/psql/bin/initdb -E 'UTF8' -D /home/thom/Development/data/
The files belonging to this database system will be owned by user "thom".
This user must also own the server process.

The database cluster will be initialized with locale en_GB.UTF-8.
The default text search configuration will be set to "english".

fixing permissions on existing directory /home/thom/Development/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 400kB
creating configuration files ... ok
creating template1 database in /home/thom/Development/data/base/1 ...
FATAL:  could not remove old lock file "postmaster.pid": No such file or directory
HINT:  The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
child process exited with exit code 1
initdb: removing contents of data directory "/home/thom/Development/data"

--
Thom

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> On 6 March 2012 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> [ scratches head... ]  I can't reproduce it with current git tip.
>
>> And I don't think I can reproduce this if I remove that directory.
>> I've seen this issue about 3 or 4 times in the past, and fixed it by
>> ditching the old data dir completely.  I'm just not sure what causes
>> this to happen.
>
> I'm a bit confused here.  Isn't the data directory totally empty before
> initdb starts?  It's supposed to refuse to proceed otherwise.

Yes, it is completely empty:

thom@swift:~/Development$ ls -la data
total 8
drwx------  2 thom thom 4096 2012-03-06 17:48 .
drwxrwxr-x 15 thom thom 4096 2012-03-06 17:46 ..

--
Thom

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> /home/thom/Development/data was causing problems so:

> mv data databroken
> mkdir data
> initdb

> ... working fine again.  The postmaster.pid I posted came from this one after it
> started up.  But if I do:

> pg_ctl stop
> rm -rf data
> mv databroken data
> initdb

> ... error messages appear again.

Okay, so the question becomes: what is different between databroken and
a freshly mkdir'd empty directory?  If there is no visible difference in
contents, ownership, or permissions, then it seems like this is evidence
of a filesystem bug (i.e., an apparently empty directory acts nonempty for
some operations).

            regards, tom lane
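
The comparison asked for above (contents, ownership, permissions) can be done mechanically. The Python sketch below is one possible way to do it. `describe_dir` and `diff_dirs` are hypothetical helpers, and the demonstration builds two scratch directories with deliberately different modes. It reports only what `stat` and a directory listing expose, so a fault below that level, such as the filesystem bug hypothesized here, might not show up in it.

```python
import os
import stat
import tempfile


def describe_dir(path):
    """Collect the metadata and listing that are visible for a directory."""
    st = os.stat(path)
    return {
        "mode": stat.filemode(st.st_mode),  # e.g. "drwx------"
        "uid": st.st_uid,
        "gid": st.st_gid,
        "nlink": st.st_nlink,
        "device": st.st_dev,
        "entries": sorted(os.listdir(path)),  # includes dotfiles, excludes . and ..
    }


def diff_dirs(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    da, db = describe_dir(a), describe_dir(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# Demonstration: two empty scratch directories that differ only in permissions.
with tempfile.TemporaryDirectory() as parent:
    fresh = os.path.join(parent, "data")
    old = os.path.join(parent, "databroken")
    os.mkdir(fresh)
    os.mkdir(old)
    os.chmod(fresh, 0o700)
    os.chmod(old, 0o755)
    differences = diff_dirs(fresh, old)
```

An empty result for the real `data` and `databroken` directories would mean no difference is visible at this level, which is the case where a filesystem problem becomes the more likely explanation.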

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 9:48:51 am Thom Brown wrote:
> On 6 March 2012 17:45, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> > On Tuesday, March 06, 2012 9:25:17 am Thom Brown wrote:
> >> These are in my env output:
> >>
> >> PATH=/home/thom/Development/psql/bin/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
> >> PGDATA=/home/thom/Development/data/
> >> PGPORT=5488
> >>
> >> This appears in my build script before configure:
> >>
> >> export PGDATA=$HOME/Development/data/
> >> export PATH=$HOME/Development/psql/bin/:$PATH
> >> export PGPORT=5488
> >>
> >> And those 3 lines also appear in my .bashrc file without any variation:
> >>
> >> export PGDATA=$HOME/Development/data/
> >> export PATH=$HOME/Development/psql/bin/:$PATH
> >> export PGPORT=5488
> >
> > And you are sure there is no pg_ctl or initdb outside
> > /usr/lib/postgresql/9.1/bin or /home/thom/Development/psql/bin and in
> > your PATH?

So that would be no:)?

> >
> > Just for grins what happens if you try an initdb using an explicit
> > reference to the binary /home/thom/Development/psql/bin/initdb and the
> > -D
> > /home/thom/Development/data/ ?
>
> thom@swift:~/Development$ /home/thom/Development/psql/bin/initdb -E
> 'UTF8' -D /home/thom/Development/data/
> The files belonging to this database system will be owned by user "thom".
> This user must also own the server process.
>
> The database cluster will be initialized with locale en_GB.UTF-8.
> The default text search configuration will be set to "english".
>
> fixing permissions on existing directory /home/thom/Development/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 10
> selecting default shared_buffers ... 400kB
> creating configuration files ... ok
> creating template1 database in /home/thom/Development/data/base/1 ...
> FATAL:  could not remove old lock file "postmaster.pid": No such file
> or directory
> HINT:  The file seems accidentally left over, but it could not be
> removed. Please remove the file by hand and try again.
> child process exited with exit code 1
> initdb: removing contents of data directory "/home/thom/Development/data"

It's official, I'm stumped.  Information seems to be persisting between
sessions, and absent some cluster other than the ones you have indicated,
I don't know where that information is coming from.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 17:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> /home/thom/Development/data was causing problems so:
>
>> mv data databroken
>> mkdir data
>> initdb
>
>> ... working fine again.  I then used the postmaster.pid from this when
>> started up.  But if I do:
>
>> pg_ctl stop
>> rm -rf data
>> mv databroken data
>> initdb
>
>> ... error messages appear again.
>
> Okay, so the question becomes: what is different between databroken and
> a freshly mkdir'd empty directory?  If there is no visible difference in
> contents, ownership, or permissions, then it seems like this is evidence
> of a filesystem bug (ie, apparently empty directory acts nonempty for
> some operations).

You may well be right.  There appear to be dark forces at work here:

thom@swift:~/Development/data$ touch postmaster.pid
thom@swift:~/Development/data$ ls -l
total 0
thom@swift:~/Development/data$ touch file.txt
thom@swift:~/Development/data$ ls -l
total 8
-rw-rw-r-- 1 thom thom 0 2012-03-06 17:59 file.txt

--
Thom
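
The touch experiment above can be repeated for any file name with a small helper. This is a sketch only: the function name created_but_hidden is invented here, and it is run against a scratch directory, so on a healthy filesystem it should report every name as listed.

```shell
#!/bin/sh
# Check that a file that was just created also shows up in a listing of
# its directory. The helper name created_but_hidden and the scratch
# directory are assumptions; point it at the suspect directory to repeat
# the experiment from the thread.

created_but_hidden() {
    # Returns 0 if creating $2 inside $1 succeeds but the name is then
    # missing from "ls -A"; returns 1 otherwise (including when the
    # file could not be created at all).
    dir=$1
    name=$2
    touch "$dir/$name" || return 1
    if ls -A "$dir" | grep -Fqx -- "$name"; then
        return 1
    fi
    return 0
}

probe_dir=$(mktemp -d)
for f in postmaster.pid file.txt; do
    if created_but_hidden "$probe_dir" "$f"; then
        echo "$f: created but NOT listed"
    else
        echo "$f: listed normally"
    fi
done
```

Trying several names is useful because the symptom in the thread was specific to postmaster.pid while file.txt behaved normally.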

Re: Unhelpful initdb error message

From
Adrian Klaver
Date:
On Tuesday, March 06, 2012 9:53:52 am Tom Lane wrote:
> Thom Brown <thom@linux.com> writes:
> > /home/thom/Development/data was causing problems so:
> >
> > mv data databroken
> > mkdir data
> > initdb
> >
> > ... working fine again.  I then used the postmaster.pid from this when
> > started up.  But if I do:
> >
> > pg_ctl stop
> > rm -rf data
> > mv databroken data
> > initdb
> >
> > ... error messages appear again.
>
> Okay, so the question becomes: what is different between databroken and
> a freshly mkdir'd empty directory?  If there is no visible difference in
> contents, ownership, or permissions, then it seems like this is evidence
> of a filesystem bug (ie, apparently empty directory acts nonempty for
> some operations).

A thought, what if you do rm -rf * in the data directory?

>
>             regards, tom lane

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 18:01, Adrian Klaver <adrian.klaver@gmail.com> wrote:
> On Tuesday, March 06, 2012 9:53:52 am Tom Lane wrote:
>> Thom Brown <thom@linux.com> writes:
>> > /home/thom/Development/data was causing problems so:
>> >
>> > mv data databroken
>> > mkdir data
>> > initdb
>> >
>> > ... working fine again.  I then used the postmaster.pid from this when
>> > started up.  But if I do:
>> >
>> > pg_ctl stop
>> > rm -rf data
>> > mv databroken data
>> > initdb
>> >
>> > ... error messages appear again.
>>
>> Okay, so the question becomes: what is different between databroken and
>> a freshly mkdir'd empty directory?  If there is no visible difference in
>> contents, ownership, or permissions, then it seems like this is evidence
>> of a filesystem bug (ie, apparently empty directory acts nonempty for
>> some operations).
>
> A thought, what if you do rm -rf * in the data directory?

I've done that a couple times, but no effect.  I think Tom's point
about a filesystem bug is probably right.

--
Thom

Re: Unhelpful initdb error message

From
Magnus Hagander
Date:
On Tue, Mar 6, 2012 at 19:03, Thom Brown <thom@linux.com> wrote:
> On 6 March 2012 18:01, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>> On Tuesday, March 06, 2012 9:53:52 am Tom Lane wrote:
>>> Thom Brown <thom@linux.com> writes:
>>> > /home/thom/Development/data was causing problems so:
>>> >
>>> > mv data databroken
>>> > mkdir data
>>> > initdb
>>> >
>>> > ... working fine again.  I then used the postmaster.pid from this when
>>> > started up.  But if I do:
>>> >
>>> > pg_ctl stop
>>> > rm -rf data
>>> > mv databroken data
>>> > initdb
>>> >
>>> > ... error messages appear again.
>>>
>>> Okay, so the question becomes: what is different between databroken and
>>> a freshly mkdir'd empty directory?  If there is no visible difference in
>>> contents, ownership, or permissions, then it seems like this is evidence
>>> of a filesystem bug (ie, apparently empty directory acts nonempty for
>>> some operations).
>>
>> A thought, what if you do rm -rf * in the data directory?
>
> I've done that a couple times, but no effect.  I think Tom's point
> about a filesystem bug is probably right.

You mentioned encryptfs, right? That's where I'd be looking first :-O

it wasn't obvious enough to throw something in your kernel dmesg log
by any chance? :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 6 March 2012 18:01, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>> A thought, what if you do rm -rf * in the data directory?

> I've done that a couple times, but no effect.  I think Tom's point
> about a filesystem bug is probably right.

Yeah, given your "touch" experiment I think that you have more than
enough ammunition to file a kernel bug.  Apparently, the directory
contents are corrupted in such a way that a file named "postmaster.pid"
can be created but it's invisible to some (perhaps not all) operations.
In some of the more complex directory data structures I could believe
that this result is filename-sensitive (think corrupted hashtable...)

            regards, tom lane

Re: Unhelpful initdb error message

From
Bosco Rama
Date:
Sorry, forgot to add the list.

Thom Brown wrote:
>
> I've done that a couple times, but no effect.  I think Tom's point
> about a filesystem bug is probably right.

Have you rebooted since this started?  There may be a process that is
holding the pid file 'deleted but present' until the process terminates.

If you can't find the process to kill it a reboot would remove all doubt.

Just a thought.

Bosco.

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Bosco Rama <postgres@boscorama.com> writes:
> Thom Brown wrote:
>> I've done that a couple times, but no effect.  I think Tom's point
>> about a filesystem bug is probably right.

> Have you rebooted since this started?  There may be a process that is
> holding the pid file 'deleted but present' until the process terminates.

Even if something is holding the file open, that wouldn't prevent unlink
from removing the directory entry for it; or even if we were talking
about a badly-designed filesystem that failed to follow standard Unix
semantics, that wouldn't explain why the directory entry is apparently
visible to some operations but not others.

Still, I agree with your point: Thom should reboot and see if the
misbehavior is still there, because that would be useful info for his
bug report.

            regards, tom lane
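
The standard Unix behaviour Tom describes can be seen with a few lines of shell. This is a sketch on a scratch directory, assuming a local filesystem that follows the usual semantics; the contents 12345 and the choice of file descriptor 3 are arbitrary.

```shell
#!/bin/sh
# Show that unlinking a file removes its directory entry even while a
# process still has the file open. The scratch directory, the contents
# "12345" and the use of descriptor 3 are illustrative choices.

demo_dir=$(mktemp -d)
echo "12345" > "$demo_dir/postmaster.pid"

exec 3< "$demo_dir/postmaster.pid"   # keep the file open in this shell
rm "$demo_dir/postmaster.pid"        # unlink succeeds despite the open fd

remaining=$(ls -A "$demo_dir" | wc -l | tr -d ' ')
echo "directory entries after rm: $remaining"

read -r still_readable <&3           # data is still reachable via the fd
echo "contents via open descriptor: $still_readable"

exec 3<&-                            # closing the fd lets the inode go
rmdir "$demo_dir"
```

So an open descriptor on its own should not produce a name that some operations can see and others cannot, which is why the symptom points elsewhere.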

Re: Unhelpful initdb error message

From
dennis jenkins
Date:
On Tue, Mar 6, 2012 at 10:11 AM, Thom Brown <thom@linux.com> wrote:
> On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>> The postmaster.pid is located outside the data directory, but points back to the
>> data directory.   Not sure where Debian, though at a guess somewhere in /var.
>> Any way search for postmaster.pid.
>
> I'm not sure, because if I use a new data directory, initdb it and
> start the service, the postmaster.pid appears in it, and not as a
> symbolic link.
>
> I did a search for postmaster.pid in the whole of /var and it only
> shows up "/var/lib/postgresql/9.1/main/postmaster.pid"
>
> --
> Thom

I know that I'm late to the party, but a small suggestion: Run
"initdb" with "strace" (truss on Solaris) and examine the syscalls
made.  It should show you, conclusively, what files are being
"open"ed, "unlink"ed, etc...

Example:

strace -o /tmp/x initdb -D /tmp/data-1
grep -E '^(open|unlink)' /tmp/x
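
A variation on the same idea, offered as a sketch: initdb starts a child postgres process, and the FATAL message comes from that child, so following forks with strace -f is probably needed. With -f and -o, each line of the trace file starts with a PID, and newer C libraries tend to use openat rather than open, so the filter below allows for both. The function name filter_file_syscalls is made up for illustration.

```shell
#!/bin/sh
# Filter an strace output file down to open/unlink style calls.
# Intended use (not run here):
#   strace -f -o /tmp/x initdb -D /tmp/data-1
#   filter_file_syscalls /tmp/x | grep postmaster.pid
# The helper name filter_file_syscalls is an assumption.

filter_file_syscalls() {
    # With -f, "strace -o" prefixes each line with the PID, so accept an
    # optional leading number before the syscall name.
    grep -E '^([0-9]+ +)?(open|openat|unlink|unlinkat)\(' "$1"
}
```

Narrowing the result to lines that mention postmaster.pid should show which call reports that the file exists and which one reports that it does not.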

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 18:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bosco Rama <postgres@boscorama.com> writes:
>> Thom Brown wrote:
>>> I've done that a couple times, but no effect.  I think Tom's point
>>> about a filesystem bug is probably right.
>
>> Have you rebooted since this started?  There may be a process that is
>> holding the pid file 'deleted but present' until the process terminates.
>
> Even if something is holding the file open, that wouldn't prevent unlink
> from removing the directory entry for it; or even if we were talking
> about a badly-designed filesystem that failed to follow standard Unix
> semantics, that wouldn't explain why the directory entry is apparently
> visible to some operations but not others.
>
> Still, I agree with your point: Thom should reboot and see if the
> misbehavior is still there, because that would be useful info for his
> bug report.

After a reboot, initdb completes successfully.  I don't think it
performed an fsck of any kind as I don't see it in the logs.

--
Thom

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 18:51, dennis jenkins <dennis.jenkins.75@gmail.com> wrote:
> On Tue, Mar 6, 2012 at 10:11 AM, Thom Brown <thom@linux.com> wrote:
>> On 6 March 2012 16:04, Adrian Klaver <adrian.klaver@gmail.com> wrote:
>>> The postmaster.pid is located outside the data directory, but points back to the
>>> data directory.   Not sure where Debian, though at a guess somewhere in /var.
>>> Any way search for postmaster.pid.
>>
>> I'm not sure, because if I use a new data directory, initdb it and
>> start the service, the postmaster.pid appears in it, and not as a
>> symbolic link.
>>
>> I did a search for postmaster.pid in the whole of /var and it only
>> shows up "/var/lib/postgresql/9.1/main/postmaster.pid"
>>
>> --
>> Thom
>
> I know that I'm late to the party, but a small suggestion: Run
> "initdb" with "strace" (truss on Solaris) and examine the syscalls
> made.  It should show you, conclusively, what files are being
> "open"ed, "unlink"ed, etc...
>
> Example:
>
> strace -o /tmp/x initdb -D /tmp/data-1
> grep -E '^(open|unlink)' /tmp/x

The reboot removed the opportunity to do this, unfortunately.  I'll
have to wait and see if it happens again, but if it does, I'll try the
suggestion.

--
Thom

Re: Unhelpful initdb error message

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 6 March 2012 18:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Still, I agree with your point: Thom should reboot and see if the
>> misbehavior is still there, because that would be useful info for his
>> bug report.

> After a reboot, initdb completes successfully.  I don't think it
> performed an fsck of any kind as I don't see it in the logs.

Fascinating.  So maybe there is something to Bosco's theory of something
holding open the old pidfile.  But what would that be?  The postmaster
doesn't hold it open; it just writes it and closes it.

            regards, tom lane

Re: Unhelpful initdb error message

From
Thom Brown
Date:
On 6 March 2012 19:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> On 6 March 2012 18:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Still, I agree with your point: Thom should reboot and see if the
>>> misbehavior is still there, because that would be useful info for his
>>> bug report.
>
>> After a reboot, initdb completes successfully.  I don't think it
>> performed an fsck of any kind as I don't see it in the logs.
>
> Fascinating.  So maybe there is something to Bosco's theory of something
> holding open the old pidfile.  But what would that be?  The postmaster
> doesn't hold it open; it just writes it and closes it.

No idea.  I did run an lsof while the problem was still present and
grep'd for the directory as I too suspected there may be some process
thinking it still had a reference to the file, but there were no
matches.

--
Thom
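
As a cross-check on the lsof result, the same question can be put to /proc directly. The sketch below is Linux-only and assumes the usual /proc/<pid>/fd layout, where the link target of an unlinked file ends in " (deleted)". The helper name deleted_but_open is invented here, and without root it can only inspect processes owned by the current user.

```shell
#!/bin/sh
# List open file descriptors whose file has been unlinked, restricted to
# paths under one directory. Linux-only: relies on /proc/<pid>/fd links.
# The helper name deleted_but_open is an assumption.

deleted_but_open() {
    dir=$1
    for fd in /proc/[0-9]*/fd/*; do
        # Unreadable or vanished entries are simply skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*" (deleted)") echo "$fd -> $target" ;;
        esac
    done
}

# Demonstration on a scratch directory: hold a file open, remove it,
# then look for it. pwd -P gives the path as the kernel reports it.
held_dir=$(cd "$(mktemp -d)" && pwd -P)
echo 4242 > "$held_dir/postmaster.pid"
exec 4< "$held_dir/postmaster.pid"
rm "$held_dir/postmaster.pid"

found=$(deleted_but_open "$held_dir")
echo "$found"

exec 4<&-
rmdir "$held_dir"
```

Where lsof is available, its +L1 option is meant to report the same class of files (open, with a link count below one). An empty result from both, as with the lsof run above, argues against a process holding the old pidfile.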

Re: Unhelpful initdb error message

From
Bosco Rama
Date:
Tom Lane wrote:
>
> Fascinating.  So maybe there is something to Bosco's theory of something
> holding open the old pidfile.

There could also have been a corrupt in-memory/cached descriptor in the
filesystem code that never needed flushing to disk?  That would help
explain why it fully went away after the reboot and yet the on-disk stuff
seems fine.

>  But what would that be?

Possibly a 3rd party/home-grown monitoring program?

Bosco.