Thread: Can't restart Postgres

Can't restart Postgres

From
"Thom Brown"
Date:
Hi,

I've got a development virtual server which matches live exactly
except for the fact that Postgres is running on a different port which
is not used by anything else.  Postgres was running fine until I
updated postgresql.conf to enhance logging and make better use of
system resources.

Here's the problem:

# /etc/init.d/postgresql-8.3 restart
 * Stopping PostgreSQL (this can take up to 90 seconds) ...
pg_ctl: PID file "/var/lib/postgresql/8.3/data/postmaster.pid" does not exist
Is server running?
 * Some clients did not disconnect within 30 seconds.
 * Going to shutdown the server anyway.
pg_ctl: PID file "/var/lib/postgresql/8.3/data/postmaster.pid" does not exist
Is server running?
 * Shutting down the server gracefully failed.
 * Forcing it to shutdown which leads to a recover-run on next startup.
pg_ctl: PID file "/var/lib/postgresql/8.3/data/postmaster.pid" does not exist
Is server running?
 * Forced shutdown failed!!! Something is wrong with your system, please take care of it manually.               [ ok ]
 * Starting PostgreSQL ...
waiting for server to start...............................................................could not start server                                        [ !! ]
 * The pid-file doesn't exist but pg_ctl reported a running server.
 * Please check whether there is another server running on the same port or read the log-file.


If you're curious, the settings I changed in postgresql.conf are as follows:

OLD: shared_buffers = 24MB
NEW: shared_buffers = 128MB

OLD: #log_destination = 'stderr'
NEW: log_destination = 'stderr'

OLD: #logging_collector = off
NEW: logging_collector = on

OLD: #log_directory = 'pg_log'
NEW: log_directory = '/var/log/pg_log'

OLD: #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
NEW: log_filename = 'postgresql-%Y-%m-%d.log'

OLD: #log_rotation_age = 1d
NEW: log_rotation_age = 1d

OLD: #log_min_duration_statement = -1
NEW: log_min_duration_statement = 0

OLD: #log_duration = off
NEW: log_duration = on

OLD: #log_line_prefix = ''
NEW: log_line_prefix = '%t [%p]: [%l-1] '

Note that the live and development configs are identical except for
the port number.
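
One way to confirm exactly which settings differ between two postgresql.conf files is to compare only the active (uncommented) lines. The following is a minimal Python sketch and not part of the original post: the names parse_conf and diff_settings are made up for illustration, and the parser is deliberately simplified (it assumes "name = value" lines and does not handle a '#' inside a quoted value).

```python
def parse_conf(text):
    """Return {name: value} for the active (uncommented) settings.

    Simplification: everything after the first '#' on a line is treated
    as a comment, and only 'name = value' lines are recognised.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(old_text, new_text):
    """Map each setting that differs to an (old, new) pair.

    None means the setting was commented out or absent on that side.
    """
    old, new = parse_conf(old_text), parse_conf(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Two of the changes listed above, used as sample input.
OLD = "shared_buffers = 24MB\n#logging_collector = off\n"
NEW = "shared_buffers = 128MB\nlogging_collector = on\n"

for name, (before, after) in diff_settings(OLD, NEW).items():
    print(f"{name}: {before} -> {after}")
```

Run against the real live and development files, a helper like this would show whether the port really is the only difference.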

Netstat data

# netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 localhost:mysql         *:*                     LISTEN
tcp        0      0 *:sunrpc                *:*                     LISTEN
tcp        0      0 *:39571                 *:*                     LISTEN
tcp6       0      0 [::]:http-alt           [::]:*                  LISTEN
tcp6       0      0 [::]:http               [::]:*                  LISTEN
tcp6       0      0 [::]:ssh                [::]:*                  LISTEN
tcp6       0      0 [::]:https              [::]:*                  LISTEN
tcp6       0    732 linode-dev.prehisto:ssh 217.154.203.18:4244     ESTABLISHED
udp        0      0 *:779                   *:*
udp        0      0 *:32781                 *:*
udp        0      0 *:sunrpc                *:*
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path
unix  2      [ ACC ]     STREAM     LISTENING     7294     @/tmp/fam-root-
unix  2      [ ACC ]     STREAM     LISTENING     1745591  /var/run/cgisock.19119
unix  2      [ ]         DGRAM                    179      @/org/kernel/udev/udevd
unix  2      [ ACC ]     STREAM     LISTENING     6073     /dev/log
unix  2      [ ACC ]     STREAM     LISTENING     7275     /var/run/fail2ban/fail2ban.sock
unix  2      [ ACC ]     STREAM     LISTENING     88239    /var/run/mysqld/mysqld.sock
unix  3      [ ]         STREAM     CONNECTED     1799425  /dev/log
unix  3      [ ]         STREAM     CONNECTED     1799424
unix  3      [ ]         STREAM     CONNECTED     1799422
unix  3      [ ]         STREAM     CONNECTED     1799421
unix  3      [ ]         STREAM     CONNECTED     1746483  /dev/log
unix  3      [ ]         STREAM     CONNECTED     1746482
unix  3      [ ]         STREAM     CONNECTED     1746384  /dev/log
unix  3      [ ]         STREAM     CONNECTED     1746382
unix  3      [ ]         STREAM     CONNECTED     7427     /dev/log
unix  3      [ ]         STREAM     CONNECTED     7426
unix  3      [ ]         STREAM     CONNECTED     7298     @/tmp/fam-root-
unix  3      [ ]         STREAM     CONNECTED     7295
unix  3      [ ]         STREAM     CONNECTED     6511     /dev/log
unix  3      [ ]         STREAM     CONNECTED     6508

I'm sure I've had this problem before (a few months ago on my home PC)
and never did solve it.

If anyone can offer some insight I'd be grateful.

Thanks

Thom

Re: Can't restart Postgres

From
Richard Huxton
Date:
Thom Brown wrote:
> Hi,
>
> I've got a development virtual server which matches live exactly
> except for the fact that Postgres is running on a different port

What do you mean by "virtual server"? And does it affect definitions of
localhost or shared-memory allocation?

> which
> is not used by anything else.  Postgres was running fine until I
> updated postgresql.conf to enhance logging and make better use of
> system resources.
>
> Here's the problem:
>
> # /etc/init.d/postgresql-8.3 restart
>  * Stopping PostgreSQL (this can take up to 90 seconds) ...
> pg_ctl: PID file "/var/lib/postgresql/8.3/data/postmaster.pid" does not exist
> Is server running?

Your system isn't set up the way you think it is - the .pid file is
missing. Is it looking in the right place?
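
That question can be checked directly. Below is a minimal Python sketch, assuming the usual layout in which the first line of postmaster.pid holds the postmaster's process ID; the function name postmaster_status is hypothetical and the returned strings are only illustrative.

```python
import os


def postmaster_status(data_dir):
    """Describe the state of postmaster.pid in the given data directory."""
    pid_path = os.path.join(data_dir, "postmaster.pid")
    try:
        with open(pid_path) as handle:
            # The first line of the file is the postmaster's PID.
            pid = int(handle.readline().strip())
    except FileNotFoundError:
        return "no pid file"
    except ValueError:
        return "unreadable pid file"

    try:
        # Signal 0 delivers nothing; it only checks that the process exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        return f"stale pid file (pid {pid} is not running)"
    except PermissionError:
        return f"pid {pid} exists but belongs to another user"
    return f"pid {pid} is running"
```

Calling postmaster_status("/var/lib/postgresql/8.3/data") with the path from the error message would distinguish a missing file from a stale one, which the init script's output does not.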

> waiting for server to
> start...............................................................could
> not start server                                        [ !! ]
>  * The pid-file doesn't exist but pg_ctl reported a running server.

> If you're curious, the settings I changed in postgresql.conf are as follows:
>
> OLD: shared_buffers = 24MB
> NEW: shared_buffers = 128MB

This can cause problems if your kernel doesn't allocate enough
shared-memory, but you should get a different error message.

> Note that the live and development configs are identical except for
> the port number.

Are they reading the right config files?

> Netstat data
>
> # netstat -a
> Active Internet connections (servers and established)
[snip]

I don't see postgresql here at all. Mysql, fam, fail2ban but not PG.

> If anyone can offer some insight I'd be grateful.

If you've got two installations on the same machine having problems then
either:
1. They're *not* running on different ports with different data
directories (check you're using the correct config file for each)

2. They're having problems with shared memory (in which case you should
see a different error message).

--
  Richard Huxton
  Archonet Ltd

Re: Can't restart Postgres

From
"Serge Fonville"
Date:
Hi,

Did you check permissions?
Do the pid files exist?
What variables are set?

Regards,

Serge Fonville

On Wed, Oct 29, 2008 at 4:43 PM, Thom Brown <thombrown@gmail.com> wrote:
[snip]

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Can't restart Postgres

From
"Thom Brown"
Date:
Permissions are identical to live.  I've checked the /tmp folder for a
PID reference, but it doesn't exist in live or dev.

What do you mean by "variables"?  How can I check?

I only have one postgresql database cluster on each server.

With regard to the config causing memory problems, the specs of both
environments are absolutely identical.  Same kernel, same disk space,
same distro, same memory.  In fact the development environment had to
be rebuilt so we cloned live and changed the necessary settings.  Is
that potentially an issue, bearing in mind it had been running until
now?

By virtual machine I mean the entire systems are running within
virtual machines on different physical machines.  Each is completely
unaware of its host and localhost is a true localhost as far as it's
concerned.

The port numbers are only different because we also have a staging
virtual machine which is on the default port and to remotely connect
to both we just changed the port on the development one to an unused
one.  This is the port it has been running on until now.

Thanks

Thom

On Wed, Oct 29, 2008 at 3:57 PM, Serge Fonville
<serge.fonville@gmail.com> wrote:
> [snip]

Re: Can't restart Postgres

From
"Scott Marlowe"
Date:
On Wed, Oct 29, 2008 at 9:43 AM, Thom Brown <thombrown@gmail.com> wrote:
> Hi,
>  * Forcing it to shutdown which leads to a recover-run on next startup.
> pg_ctl: PID file "/var/lib/postgresql/8.3/data/postmaster.pid" does not exist
> Is server running?
>  * Forced shutdown failed!!! Something is wrong with your system,
> please take care of it manually.
>               [ ok ]
>  * Starting PostgreSQL ...
> waiting for server to
> start...............................................................could
> not start server                                        [ !! ]
>  * The pid-file doesn't exist but pg_ctl reported a running server.
>  * Please check whether there is another server running on the same
> port or read the log-file.

At this point did you do something like:

ps ax|grep postgres

???

Re: Can't restart Postgres

From
"Thom Brown"
Date:
Actually I did "ps aux | grep post" just to cover all bases, but still
nothing.. except of course the grep itself.

On Wed, Oct 29, 2008 at 6:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> [snip]

Re: Can't restart Postgres

From
Tom Lane
Date:
"Thom Brown" <thombrown@gmail.com> writes:
> Actually I did "ps aux | grep post" just to cover all bases, but still
> nothing.. except of course the grep itself.

The overwhelming impression from here is of a seriously brain-dead
startup script.  It's spending all its effort on being chatty and none
on actually dealing with unusual cases correctly :-(.  Whose script
is it anyway?

My bet is that there's some discrepancy between what the script is
expecting and what your intended configuration is.  I'm not sure if
the discrepancy is exactly the PID-file location or if it's more subtle
than that, but anyway I'd suggest reading through that script carefully
to see what it's actually doing.

            regards, tom lane

Re: Can't restart Postgres

From
"Thom Brown"
Date:
I think I must have only done a reload on the live server as now I've
tried to restart the service and I've got exactly the same error, so
it's no longer a discrepancy between environments.

The script is actually one which came with the Gentoo package.  I can
see it is using both $PGOPTS and $PGDATA, neither of which is populated
with anything on either server.  I've assigned $PGDATA to the database
cluster path but it still doesn't start.  I've also checked
/etc/conf.d/postgresql-8.3 which contains correct settings.

Okay, so I've manually tried starting the server now and told it to
output any log to /tmp.  This is telling me that the request for a
shared memory segment is higher than my kernel's SHMMAX parameter.  My
bad, I've put my settings in incorrectly, and as it states in the
config file, changes to that setting require a restart.  I've reset
all values back to how they were and it is running again.  I didn't
think my changes were that demanding, but obviously they were.  I'll
have to look into it more.
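
The arithmetic behind that error can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the thread never gives the kernel's actual SHMMAX, so the 32 MB figure below is assumed (it was a long-standing Linux default), and the helper names are hypothetical. The real shared memory request is larger than shared_buffers alone, so this check can only prove that a setting is too big, not that it fits.

```python
import re

# Multipliers for the memory units postgresql.conf accepts.
_UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(value, block_bytes=8192):
    """Convert a value such as '128MB' to bytes.

    A bare number is read as a count of blocks (8 kB with the default
    block size), which is how shared_buffers treats unit-less values.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(kB|MB|GB)?\s*", value)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * (_UNITS[unit] if unit else block_bytes)


def fits_under_shmmax(shared_buffers, shmmax_bytes):
    """True if shared_buffers by itself is below SHMMAX.

    Necessary but not sufficient: the server adds further overhead on
    top of shared_buffers when it requests its segment.
    """
    return parse_size(shared_buffers) < shmmax_bytes


def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Read the running kernel's limit on Linux (not called below)."""
    with open(path) as handle:
        return int(handle.read().strip())


ASSUMED_SHMMAX = 32 * 1024 ** 2  # assumption, not taken from the thread

for setting in ("24MB", "128MB"):
    print(setting, fits_under_shmmax(setting, ASSUMED_SHMMAX))
```

Under that assumed limit the old 24MB value passes the check and the new 128MB value does not, which would be consistent with the server starting again once the settings were reverted.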

Thanks for the suggestions.

Thom

On Thu, Oct 30, 2008 at 2:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [snip]

Re: Can't restart Postgres

From
Thomas
Date:
I myself noticed that if a client is still connected to the DB server,
then PgSQL won't restart. Are you sure all your clients are/were
disconnected? I myself have the DB on a remote virtual machine.

Re: Can't restart Postgres

From
"Thom Brown"
Date:
Well that can't really be the problem since it isn't running when
trying to start.

But yes, I've noticed that before, which I actually find very useful.
It's a shame there isn't a way for postgres to broadcast to clients
that it wants to shut down so things like pgAdmin III will say "Hey,
the server's about to go down.  Do what you need to do and get the
frag outta here!"

On Thu, Oct 30, 2008 at 10:42 AM, Thomas <iamkenzo@gmail.com> wrote:
> [snip]

Re: Can't restart Postgres

From
Tom Lane
Date:
"Thom Brown" <thombrown@gmail.com> writes:
> The script is actually one which came with the Gentoo package.
> ...
> Okay, so I've manually tried starting the server now and told it to
> output any log to /tmp.  This is telling me that the request for a
> shared memory segment is higher than my kernel's SHMMAX parameter.  My
> bad, I've put my settings in incorrectly, and as it states in the
> config file, changes to that setting require a restart.  I've reset
> all values back to how they were and it is running again.

Yeah, SHMMAX overrun is a pretty common problem.  You really need to
complain to whoever maintains the Gentoo package that their start script
is so utterly, noisily unhelpful in the presence of a postmaster startup
issue.

            regards, tom lane