Thread: Cannot Start Postgres After System Boot

Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
   For reasons I do not understand, the Slackware start-up file for postgres
(/etc/rc.d/rc.postgresql) fails to work properly after I reboot the system.
(Reboots normally occur only after a kernel upgrade or with a hardware
failure that crashes the system.)

   Trying to restart postgres manually (su postgres -c 'postgres -D
/var/lib/pgsql/data &') fails regardless of the presence of /tmp/.s.PGSQL.5432
and /var/lib/pgsql/postmaster.pid. Here's what I see:

[rshepard@salmo ~]$ su postgres -c 'postgres -D /var/lib/pgsql/data &'
Password:
[rshepard@salmo ~]$ LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a
few seconds and retry.
WARNING:  could not create listen socket for "localhost"
FATAL:  could not create any TCP/IP sockets

   If someone would be kind enough to point out what I'm doing incorrectly
(e.g., removing /tmp/.s.PGSQL.5432 and postmaster.pid when the startup
process complains they're not right) I'll save this information for the next
time. I can also provide the 'start' section of the Slackware init file so I
could learn why it's not working properly.

TIA,

Rich

Re: Cannot Start Postgres After System Boot

From
Andrej
Date:
On 21 October 2010 11:53, Rich Shepard <rshepard@appl-ecosys.com> wrote:
>  If someone would be kind enough to point out what I'm doing incorrectly
> (e.g., removing /tmp/.s.PGSQL.5432 and postmaster.pid when the startup
> process complains they're not right) I'll save this information for the next
> time. I can also provide the 'start' section of the Slackware init file so I
> could learn why it's not working properly.

Please do  - provide the section, I mean.


> TIA,
>
> Rich
Cheers,
Andrej


--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.georgedillon.com/web/html_email_is_evil.shtml

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Andrej wrote:

> Please do  - provide the section, I mean.

Andrej,

   The entire script is attached. It's only 2588 bytes.

   Also, when there is no postmaster.pid or .s.PGSQL.5432 (and its lock file)
are these recreated automagically when postgres is properly loaded, or do I
need to do something first?

Many thanks,

Rich

Attachment

Re: Cannot Start Postgres After System Boot

From
Tom Lane
Date:
Rich Shepard <rshepard@appl-ecosys.com> writes:
>    The entire script is attached. It's only 2588 bytes.

Personally, I'd drop all the machinations with checking the pidfile or
removing old socket files.  The postmaster is fully capable of doing
those things for itself, and is much less likely to do them mistakenly
than this script is.  In particular, I wonder whether the script's
refusal to start if the pidfile already exists accounts for your
report that it fails to auto-restart after a reboot.

IOW, this:

>         else # remove old socket, if it exists and no daemon is running.

>             if [ ! -f $DATADIR/$PIDFILE ]; then
>                 rm -f /tmp/.s.PGSQL.5432
>                 rm -f /tmp/.s.PGSQL.5432.lock
>                 # pg_ctl start -w -l $LOGFILE -D $DATADIR
>                 su postgres -c 'postgres -D /var/lib/pgsql/data &'
>                 exit 0
>             else
>                 echo "PostgreSQL daemon was not properly shut down"
>                 echo "Please remove stale pid file $DATADIR/$PIDFILE"
>                 exit 7
>             fi

>         fi

could be reduced to just:

        else
            su postgres -c 'postgres -D /var/lib/pgsql/data &'
            exit 0
        fi

I'd also strongly recommend making that be "su - postgres -c ..."
rather than the way it is now; it's failing to ensure that the
postmaster is started with the postgres account's login settings.
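
For anyone adapting a similar init script, here is a minimal sketch of what the reduced start branch might look like with "su -" applied. The function name pg_start and the RUN dry-run hook are illustrative and not part of the Slackware script; DATADIR is the variable the quoted script already uses, and the default path is assumed from this thread.

```shell
#!/bin/sh
# Sketch only: a start branch that leaves pidfile and socket cleanup to the
# postmaster. pg_start and RUN are hypothetical names, not from the original
# rc.postgresql script.
DATADIR=${DATADIR:-/var/lib/pgsql/data}

pg_start() {
    echo "Starting PostgreSQL"
    # "su -" starts a login shell, so the postmaster inherits the postgres
    # account's login settings. Set RUN=echo to print the command instead of
    # running it.
    ${RUN:-} su - postgres -c "postgres -D $DATADIR &"
}
```

Calling "RUN=echo pg_start" prints the command without executing it, which is a cheap way to check the invocation before the next reboot.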

I'm not sure about your comment that manual start attempts fail with
    LOG:  could not bind IPv4 socket: Address already in use
It's pretty hard to believe that that could occur on a freshly
booted system unless the TCP port was in fact already in use ---
ie, either there *is* a running postmaster, or something else is
using port 5432.

            regards, tom lane

Re: Cannot Start Postgres After System Boot

From
Andrej
Date:
On 21 October 2010 16:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> could be reduced to just:
>
>                else
>                        su postgres -c 'postgres -D /var/lib/pgsql/data &'
>                        exit 0
>                fi


> I'm not sure about your comment that manual start attempts fail with
>        LOG:  could not bind IPv4 socket: Address already in use
> It's pretty hard to believe that that could occur on a freshly
> booted system unless the TCP port was in fact already in use ---
> ie, either there *is* a running postmaster, or something else is
> using port 5432.

I concur on both accounts; I would like to see the output of the
actual script, though, when it refuses to start; and also a
netstat -anp | grep 5432


Cheers,
Andrej

Re: Cannot Start Postgres After System Boot

From
Reid Thompson
Date:
On 10/20/2010 6:53 PM, Rich Shepard wrote:
>   For reasons I do not understand, the Slackware start-up file for postgres
> (/etc/rc.d/rc.postgresql) fails to work properly after I reboot the system.
> (Reboots normally occur only after a kernel upgrade or with a hardware
> failure that crashes the system.)
>
>   Trying to restart the system manually (su postgres -c 'postgres -D
> /var/lib/pgsql/data &') regardless of the presence of /tmp/.s.PGSQL.5432
> and /var/lib/pgsql/postmaster.pid. Here's what I see:
>
> [rshepard@salmo ~]$ su postgres -c 'postgres -D /var/lib/pgsql/data &'
> Password:
> [rshepard@salmo ~]$ LOG:  could not bind IPv4 socket: Address already in use
> HINT:  Is another postmaster already running on port 5432? If not, wait a
> few seconds and retry.
> WARNING:  could not create listen socket for "localhost"
> FATAL:  could not create any TCP/IP sockets
>
>   If someone would be kind enough to point out what I'm doing incorrectly
> (e.g., removing /tmp/.s.PGSQL.5432 and postmaster.pid when the startup
> process complains they're not right) I'll save this information for the next
> time. I can also provide the 'start' section of the Slackware init file so I
> could learn why it's not working properly.
>
> TIA,
>
> Rich
>
what does
$ netstat -an|grep 5432
return?

what does
$ ps -ef|grep post
return?

The above indicates that the tcp ipv4 socket is already bound by some process

Re: Cannot Start Postgres After System Boot

From
Scott Marlowe
Date:
On Wed, Oct 20, 2010 at 4:53 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote:
>  For reasons I do not understand, the Slackware start-up file for postgres
> (/etc/rc.d/rc.postgresql) fails to work properly after I reboot the system.
> (Reboots normally occur only after a kernel upgrade or with a hardware
> failure that crashes the system.)
>
>  Trying to restart the system manually (su postgres -c 'postgres -D
> /var/lib/pgsql/data &') regardless of the presence of /tmp/.s.PGSQL.5432
> and /var/lib/pgsql/postmaster.pid. Here's what I see:
>
> [rshepard@salmo ~]$ su postgres -c 'postgres -D /var/lib/pgsql/data &'
> Password:
> [rshepard@salmo ~]$ LOG:  could not bind IPv4 socket: Address already in use
> HINT:  Is another postmaster already running on port 5432? If not, wait a
> few seconds and retry.
> WARNING:  could not create listen socket for "localhost"
> FATAL:  could not create any TCP/IP sockets

Are you sure postgresql isn't getting started by some other init
script before this one runs?  A warning that a port can't be bound
usually means just that: something else is on it.  What does lsof tell
you is running on that port?

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Wed, 20 Oct 2010, Tom Lane wrote:

> Personally, I'd drop all the machinations with checking the pidfile or
> removing old socket files.

Tom,

   I didn't write the script; whoever maintains the Slackware package for
PostgreSQL did. Regardless, I'll make the changes you suggest.

> In particular, I wonder whether the script's refusal to start if the
> pidfile already exists accounts for your report that it fails to
> auto-restart after a reboot.

   This clears up my uncertainty. The pidfile should not exist after a clean
shutdown, so it should be removed after a crash, too.

> could be reduced to just:
>
>         else
>             su postgres -c 'postgres -D /var/lib/pgsql/data &'
>             exit 0
>         fi
>
> I'd also strongly recommend making that be "su - postgres -c ..."
> rather than the way it is now; it's failing to ensure that the
> postmaster is started with the postgres account's login settings.

   Done. I wondered about the 'su postgres' because when I run that on the
command line I'm asked for the postgres password. I suppose that since
root's running the init file it's not asked.

> I'm not sure about your comment that manual start attempts fail with
>     LOG:  could not bind IPv4 socket: Address already in use
> It's pretty hard to believe that that could occur on a freshly
> booted system unless the TCP port was in fact already in use ---
> ie, either there *is* a running postmaster, or something else is
> using port 5432.

   I'm not seeing this now, but running the revised script (as root) still
produces this:

Starting PostgreSQL
3753
3755
3756
3757
3758
16481
PostgreSQL daemon already running
Warning: Missing pid file /var/lib/pgsql/data/postmaster.pid

   Yet, when I try to access one of my databases I cannot:

[rshepard@salmo ~]$ psql aesi
psql: could not connect to server: No such file or directory
         Is the server running locally and accepting
         connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

   There was no postgres running before I ran /etc/rc.d/rc.postgresql start.
There is also no socket on /tmp.

   I'd greatly appreciate learning why the startup script is not working so I
can be confident that either the rc.postgresql file or my command line
invocation will consistently work properly to start the server. I will
provide whatever system information is needed to help diagnose and fix this
problem.

Many thanks,

Rich



Re: Cannot Start Postgres After System Boot

From
Scott Marlowe
Date:
On Thu, Oct 21, 2010 at 11:27 AM, Rich Shepard <rshepard@appl-ecosys.com> wrote:
>  Yet, when I try to access one of my databases I cannot:
>
> [rshepard@salmo ~]$ psql aesi
> psql: could not connect to server: No such file or directory
>        Is the server running locally and accepting
>        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

So, what do

telnet localhost 5432
AND
psql -h localhost -l

do?

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Scott Marlowe wrote:

> So, what do
>
> telnet localhost 5432

Scott,

   That port's clear:

[rshepard@salmo ~]$ telnet localhost 5432
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

> AND
> psql -h localhost -l

   Huh!

[rshepard@salmo ~]$ psql -h localhost -l
          List of databases
    Name    |   Owner    | Encoding
-----------+------------+----------
  aesi      | sql-ledger | LATIN1
  cms       | rshepard   | UTF8
  postgres  | postgres   | UTF8
  refdb     | postgres   | UTF8
  scirefs   | rshepard   | LATIN1
  template0 | postgres   | UTF8
  template1 | postgres   | UTF8
(7 rows)

   So, why can't I connect to a database by entering, for example, 'psql
aesi'?

Thanks,

Rich

Re: Cannot Start Postgres After System Boot

From
Lennin Caro
Date:
--- On Thu, 10/21/10, Reid Thompson <reid.thompson@ateb.com> wrote:

From: Reid Thompson <reid.thompson@ateb.com>
Subject: Re: [GENERAL] Cannot Start Postgres After System Boot
To: "Rich Shepard" <rshepard@appl-ecosys.com>
Cc: pgsql-general@postgresql.org
Date: Thursday, October 21, 2010, 4:28 AM

On 10/20/2010 6:53 PM, Rich Shepard wrote:
>   For reasons I do not understand, the Slackware start-up file for postgres
> (/etc/rc.d/rc.postgresql) fails to work properly after I reboot the system.
> (Reboots normally occur only after a kernel upgrade or with a hardware
> failure that crashes the system.)
>
>   Trying to restart the system manually (su postgres -c 'postgres -D
> /var/lib/pgsql/data &') regardless of the presence of /tmp/.s.PGSQL.5432
> and /var/lib/pgsql/postmaster.pid. Here's what I see:
>
> [rshepard@salmo ~]$ su postgres -c 'postgres -D /var/lib/pgsql/data &'
> Password:
> [rshepard@salmo ~]$ LOG:  could not bind IPv4 socket: Address already in use
> HINT:  Is another postmaster already running on port 5432? If not, wait a
> few seconds and retry.
> WARNING:  could not create listen socket for "localhost"
> FATAL:  could not create any TCP/IP sockets
>
>   If someone would be kind enough to point out what I'm doing incorrectly
> (e.g., removing /tmp/.s.PGSQL.5432 and postmaster.pid when the startup
> process complains they're not right) I'll save this information for the next
> time. I can also provide the 'start' section of the Slackware init file so I
> could learn why it's not working properly.
>
> TIA,
>
> Rich
>
> what does
> $ netstat -an|grep 5432
> return?
>
> what does
> $ ps -ef|grep post
> return?
>
> The above indicates that the tcp ipv4 socket is already bound by some process

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Try to delete the files like this

.s.PGSQL.5432
.s.PGSQL.5432.lock
8.x-main.pid

and restart postmaster


Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Reid Thompson wrote:

> what does
> $ netstat -an|grep 5432
> return?

Reid,

[rshepard@salmo ~]$ netstat -an|grep 5432
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN
unix  3      [ ]         STREAM     CONNECTED     785432
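
A side note on reading this output: the "unix" line matches the grep only because the number in its last column (785432, the socket's inode number) happens to contain "5432". A stricter filter avoids that kind of false match. The sketch below assumes the usual Linux "netstat -an" column layout (local address in column 4, state in column 6); tcp_listeners is a made-up helper name.

```shell
#!/bin/sh
# Sketch: read `netstat -an` output on stdin and print only the TCP sockets
# that are in LISTEN state on the given port.
# Assumes the Linux net-tools layout: $1 = proto, $4 = local address, $6 = state.
tcp_listeners() {
    awk -v port="$1" '$1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$")'
}

# Usage: netstat -an | tcp_listeners 5432
```

With the two lines shown above as input, only the tcp line is printed, so the result says a TCP listener exists on 127.0.0.1:5432 and nothing about a Unix socket.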

> what does
> $ ps -ef|grep post
> return?
> The above indicates that the tcp ipv4 socket is already bound by some process

[rshepard@salmo ~]$ ps -ef|grep post
postgres  3753     1  0 Oct20 ?        00:00:00 postgres -D /var/lib/pgsql/data
postgres  3755  3753  0 Oct20 ?        00:00:00 postgres: writer process
postgres  3756  3753  0 Oct20 ?        00:00:00 postgres: wal writer process
postgres  3757  3753  0 Oct20 ?        00:00:00 postgres: autovacuum launcher process
postgres  3758  3753  0 Oct20 ?        00:00:00 postgres: stats collector process
root      4285     1  0 Oct19 ?        00:00:01 /usr/libexec/postfix/master
postfix   4287  4285  0 Oct19 ?        00:00:00 qmgr -l -t fifo -u
postfix  10143  4285  0 02:15 ?        00:00:00 anvil -l -t unix -u
postfix  16244  4285  0 10:01 ?        00:00:00 smtpd -n smtp -t inet -u -o stress
postfix  16245  4285  0 10:01 ?        00:00:00 trivial-rewrite -n rewrite -t unix -u
postfix  16246  4285  0 10:01 ?        00:00:00 smtpd -n smtp -t inet -u -o stress
postfix  16305  4285  0 10:06 ?        00:00:00 smtpd -n smtp -t inet -u -o stress
postfix  16426  4285  0 10:15 ?        00:00:00 smtpd -n smtp -t inet -u -o stress
postfix  16625  4285  0 10:31 ?        00:00:00 pickup -l -t fifo -u
postfix  16743  4285  0 10:38 ?        00:00:00 cleanup -z -t unix -u
postfix  16744  4285  0 10:38 ?        00:00:00 local -t unix

   Yet I cannot connect to a database either from the command line or, in the
case of SQL-Ledger, from firefox:

Error!

could not connect to server: No such file or directory Is the server running
locally and accepting connections on Unix domain socket
"/tmp/.s.PGSQL.5432"?

Thanks,

Rich

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Lennin Caro wrote:

> Try to delete the files like this

> .s.PGSQL.5432
> .s.PGSQL.5432.lock
> 8.x-main.pid

> and restart postmaster

Lennin,

   The sockets are not to be found.

Rich


Re: Cannot Start Postgres After System Boot

From
Adrian Klaver
Date:
On 10/21/2010 10:41 AM, Rich Shepard wrote:
> On Thu, 21 Oct 2010, Reid Thompson wrote:
>
>> what does
>> $ netstat -an|grep 5432
>> return?
>
> Reid,
>
> [rshepard@salmo ~]$ netstat -an|grep 5432
> tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN
> unix 3 [ ] STREAM CONNECTED 785432
>
>> what does
>> $ ps -ef|grep post
>> return?
>> The above indicates that the tcp ipv4 socket is already bound by some
>> process
>
> [rshepard@salmo ~]$ ps -ef|grep post
> postgres 3753 1 0 Oct20 ? 00:00:00 postgres -D /var/lib/pgsql/data
> postgres 3755 3753 0 Oct20 ? 00:00:00 postgres: writer process
> postgres 3756 3753 0 Oct20 ? 00:00:00 postgres: wal writer process
> postgres 3757 3753 0 Oct20 ? 00:00:00 postgres: autovacuum launcher process
> postgres 3758 3753 0 Oct20 ? 00:00:00 postgres: stats collector process
> root 4285 1 0 Oct19 ? 00:00:01 /usr/libexec/postfix/master
> postfix 4287 4285 0 Oct19 ? 00:00:00 qmgr -l -t fifo -u
> postfix 10143 4285 0 02:15 ? 00:00:00 anvil -l -t unix -u
> postfix 16244 4285 0 10:01 ? 00:00:00 smtpd -n smtp -t inet -u -o stress
> postfix 16245 4285 0 10:01 ? 00:00:00 trivial-rewrite -n rewrite -t unix -u
> postfix 16246 4285 0 10:01 ? 00:00:00 smtpd -n smtp -t inet -u -o stress
> postfix 16305 4285 0 10:06 ? 00:00:00 smtpd -n smtp -t inet -u -o stress
> postfix 16426 4285 0 10:15 ? 00:00:00 smtpd -n smtp -t inet -u -o stress
> postfix 16625 4285 0 10:31 ? 00:00:00 pickup -l -t fifo -u
> postfix 16743 4285 0 10:38 ? 00:00:00 cleanup -z -t unix -u
> postfix 16744 4285 0 10:38 ? 00:00:00 local -t unix
>
> Yet I cannot connect to a database either from the command line or, in the
> case of SQL-Ledger, from firefox:
>
> Error!
>
> could not connect to server: No such file or directory Is the server running
> locally and accepting connections on Unix domain socket
> "/tmp/.s.PGSQL.5432"?
>
> Thanks,
>
> Rich
>

What does your postgresql.conf file show for this setting?

listen_addresses =

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Adrian Klaver wrote:

> What does your postgresql.conf file show for ? :
> listen_addresses =

Adrian,

#listen_addresses = 'localhost'         # what IP address(es) to listen on;

   This hasn't changed.
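
For reference, a line that begins with "#" in postgresql.conf is a comment, so the setting quoted above is not active and the server uses its built-in default. A rough helper for checking whether a parameter is actually set might look like the sketch below. It assumes simple "name = value" lines, ignores include files and command-line overrides, and does not handle a "#" inside a quoted value; conf_setting is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the active (uncommented) value of a parameter in a
# postgresql.conf-style file. Returns non-zero and prints nothing when the
# parameter is absent or commented out.
#   $1 = path to the config file, $2 = parameter name
conf_setting() {
    # Keep the text between "=" and any trailing comment, trim trailing
    # whitespace, and take the last occurrence in the file.
    value=$(sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p" "$1" |
            sed 's/[[:space:]]*$//' | tail -n 1)
    [ -n "$value" ] && printf '%s\n' "$value"
}

# Usage:
#   conf_setting /var/lib/pgsql/data/postgresql.conf unix_socket_directory \
#       || echo "not set; the compiled-in default applies"
```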

Thanks,

Rich

Re: Cannot Start Postgres After System Boot

From
Tom Lane
Date:
Rich Shepard <rshepard@appl-ecosys.com> writes:
> On Wed, 20 Oct 2010, Tom Lane wrote:
>> In particular, I wonder whether the script's refusal to start if the
>> pidfile already exists accounts for your report that it fails to
>> auto-restart after a reboot.

>    This clears up my uncertainty. The pidfile should not exist after a clean
> shutdown, so it should be removed after a crash, too.

Actually, I was saying that the script should *not* concern itself with
the pidfile at all.  Having a script that automatically removes the
pidfile is a big foot-gun: if you ever run it at any time other than
system boot, you'll destroy a critical interlock against starting two
postmasters in the same data directory.  The postmaster is perfectly
capable of getting rid of a stale pidfile by itself, and is far less
likely to do the wrong thing than a scripted removal is.

>    Yet, when I try to access one of my databases I cannot:

> [rshepard@salmo ~]$ psql aesi
> psql: could not connect to server: No such file or directory
>          Is the server running locally and accepting
>          connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

>    There was no postgres running before I ran /etc/rc.d/rc.postgresql start.
> There is also no socket on /tmp.

Hmm, maybe the postmaster thinks it should be putting the socket file
someplace other than /tmp.  Have you got a nondefault setting of
unix_socket_directory in postgresql.conf?  Also, if you're using the
distro's build of postgresql not your own, it's possible that the
compiled-in default for unix_socket_directory isn't /tmp --- though
the copy of libpq you're using seems to think it is /tmp.  Maybe your
libpq came from someplace different than the postmaster executable?
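
One way to test this theory is to look for the socket file in a few likely directories and point psql at whichever one has it, since psql treats a -h value that begins with a slash as a socket directory. The sketch below is only that: find_pg_socket_dir is a made-up helper, and the candidate directories in the usage line are guesses that vary by distribution.

```shell
#!/bin/sh
# Sketch: print the first candidate directory that contains a Unix socket
# named .s.PGSQL.<port>. Returns non-zero if none of them does.
#   $1 = port, remaining arguments = directories to check
find_pg_socket_dir() {
    port=$1
    shift
    for dir in "$@"; do
        # -S is true only for an existing socket, not for a regular file.
        if [ -S "$dir/.s.PGSQL.$port" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (directories are assumptions):
#   dir=$(find_pg_socket_dir 5432 /tmp /var/run/postgresql) && psql -h "$dir" -l
```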

            regards, tom lane

Re: Cannot Start Postgres After System Boot

From
"Reid Thompson"
Date:

On Thu, 2010-10-21 at 10:35 -0700, Rich Shepard wrote:
> On Thu, 21 Oct 2010, Scott Marlowe wrote:
>
> > So, what do
> >
> > telnet localhost 5432
>
> Scott,
>
>    That port's clear:
>
> [rshepard@salmo ~]$ telnet localhost 5432
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
>
> > AND
> > psql -h localhost -l
>
>    Huh!
>
> [rshepard@salmo ~]$ psql -h localhost -l
>           List of databases
>     Name    |   Owner    | Encoding
> -----------+------------+----------
>   aesi      | sql-ledger | LATIN1
>   cms       | rshepard   | UTF8
>   postgres  | postgres   | UTF8
>   refdb     | postgres   | UTF8
>   scirefs   | rshepard   | LATIN1
>   template0 | postgres   | UTF8
>   template1 | postgres   | UTF8
> (7 rows)
>
>    So, why can't I connect to a database by entering, for example, 'psql
> aesi'?
>
> Thanks,
>
> Rich

what does
$ netstat -an |grep 5432
return?

something is running on tcp port 5432

Re: Cannot Start Postgres After System Boot

From
Scott Marlowe
Date:
On Thu, Oct 21, 2010 at 11:35 AM, Rich Shepard <rshepard@appl-ecosys.com> wrote:
> On Thu, 21 Oct 2010, Scott Marlowe wrote:
>
>> So, what do
>>
>> telnet localhost 5432
>
> Scott,
>
>  That port's clear:
>
> [rshepard@salmo ~]$ telnet localhost 5432
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.

So something IS attached and is answering the phone.

>> AND
>> psql -h localhost -l
>
>  Huh!
>
> [rshepard@salmo ~]$ psql -h localhost -l
>         List of databases

So a postgres IS running on your machine.  I put it to you it's not
running where you think it is.

Re: Cannot Start Postgres After System Boot

From
Scott Marlowe
Date:
On Thu, Oct 21, 2010 at 11:36 AM, Lennin Caro <lennin.caro@yahoo.com> wrote:
>
> Try to delete the files like this
>
> .s.PGSQL.5432
> .s.PGSQL.5432.lock
> 8.x-main.pid
>
> and restart postmaster

WHOA, never delete those files unless you're sure you've killed off
postgres first.  Then and only then you can delete them and safely
restart.  If you ever manage to bring up two postmasters on the same
store you've just destroyed your database.

--
To understand recursion, one must first understand recursion.

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Tom Lane wrote:

> Actually, I was saying that the script should *not* concern itself with
> the pidfile at all.

Tom,

   I understood what you wrote.

> Hmm, maybe the postmaster thinks it should be putting the socket file
> someplace other than /tmp. Have you got a nondefault setting of
> unix_socket_directory in postgresql.conf?

   No. It's been commented out forever, so it should be the default.

> Also, if you're using the distro's build of postgresql not your own, it's
> possible that the compiled-in default for unix_socket_directory isn't /tmp
> --- though the copy of libpq you're using seems to think it is /tmp.

   The currently installed 8.3.3 has been running for some time now. I've not
made any changes since last Friday (the last day I used one of the
databases), and the system board failed Sunday afternoon, just after an OS
upgrade.

> Maybe your libpq came from someplace different than the postmaster
> executable?

   I've no idea how that could have happened.

   Since I cannot start the postmaster I cannot run pg_dumpall. What's the
pragmatic way for me to once again get postgres running (and, presumably,
able to cleanly stop and restart when necessary)?

Many thanks,

Rich



Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Scott Marlowe wrote:

> WHOA, never delete those files unless you're sure you've killed off
> postgres first.  Then and only then you can delete them and safely
> restart.  If you ever manage to bring up two postmasters on the same store
> you've just destroyed your database.

Scott,

   Postgres has not been running. That's the problem I've been trying to
solve. The only reason I've manually killed the socket and its lock is when
the system shut down uncleanly and postgres would not start while they were
present.

Thanks,

Rich


Re: Cannot Start Postgres After System Boot

From
Adrian Klaver
Date:
On 10/21/2010 11:21 AM, Rich Shepard wrote:
> On Thu, 21 Oct 2010, Scott Marlowe wrote:
>
>> WHOA, never delete those files unless you're sure you've killed off
>> postgres first. Then and only then you can delete them and safely
>> restart. If you ever manage to bring up two postmasters on the same store
>> you've just destroyed your database.
>
> Scott,
>
> Postgres has not been running. That's the problem I've been trying to
> solve. The only reason I've manually killed the socket and its lock is when
> the system shut down uncleanly and postgres would not start while they were
> present.
>
> Thanks,
>
> Rich
>
>

But it is running:

[rshepard@salmo ~]$ psql -h localhost -l
          List of databases
    Name    |   Owner    | Encoding
-----------+------------+----------
  aesi      | sql-ledger | LATIN1
  cms       | rshepard   | UTF8
  postgres  | postgres   | UTF8
  refdb     | postgres   | UTF8
  scirefs   | rshepard   | LATIN1
  template0 | postgres   | UTF8
  template1 | postgres   | UTF8


The missing piece of information seems to be the system board failure.
My guess is that caused corruption.  See if you can connect by doing:

psql -h localhost -d aesi

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Reid Thompson wrote:

> what does
> $ netstat -an |grep 5432
> return?
>
> something is running on tcp port 5432

   Doesn't show that.

[rshepard@salmo ~]$ netstat -an |grep 5432
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN

Rich

Re: Cannot Start Postgres After System Boot

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Scott Marlowe wrote:

> So a postgres IS running on your machine.  I put it to you it's not
> running where you think it is.

   When I ran 'ps ax | grep post' I found a few postgres processes. I tried
'/etc/rc.d/rc.postgresql stop' but that had no effect. I killed the lowest
numbered process and that removed them all. However, I still cannot start a
new postgresql process.

Rich

Re: Cannot Start Postgres After System Boot

From
"Reid Thompson"
Date:

On Thu, 2010-10-21 at 11:38 -0700, Rich Shepard wrote:
> On Thu, 21 Oct 2010, Reid Thompson wrote:
>
> > what does
> > $ netstat -an |grep 5432
> > return?
> >
> > something is running on tcp port 5432
>
>    Doesn't show that.
>
> [rshepard@salmo ~]$ netstat -an |grep 5432
> tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN

The above line means that something is listening on TCP port 5432.
You do NOT have a listener on the Unix domain socket for port 5432.
For example, my box has both:

$ netstat -an|grep 5432
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN    
unix  2      [ ACC ]     STREAM     LISTENING     413260   /var/run/postgresql/.s.PGSQL.5432

If I telnet to
$ telnet localhost 5432

and run
$ netstat -an|grep 5432
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN    
tcp        0      0 127.0.0.1:56771         127.0.0.1:5432          ESTABLISHED
tcp        0      0 127.0.0.1:5432          127.0.0.1:56771         ESTABLISHED
unix  2      [ ACC ]     STREAM     LISTENING     413260   /var/run/postgresql/.s.PGSQL.5432
rthompso@raker>~

the established connection is shown
and lsof shows
$ lsof -i TCP:5432
COMMAND   PID     USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
telnet  22648 rthompso    3u  IPv4 445992      0t0  TCP raker.ateb.com:56771->raker.ateb.com:postgresql (ESTABLISHED)
rthompso@raker>~ 
$

Re: Cannot Start Postgres After System Boot

From
"Reid Thompson"
Date:

On Thu, 2010-10-21 at 11:45 -0700, Rich Shepard wrote:
> On Thu, 21 Oct 2010, Scott Marlowe wrote:
>
> > So a postgres IS running on your machine.  I put it to you it's not
> > running where you think it is.
>
>    When I run 'ps ax | grep post' I found a few postgres processes. I tried
> '/etc/rc.d/rc.postgresql stop' but that had no effect. I killed the lowest
> numbered process and that removed them all. However, I still cannot start a
> new postgresql process.
>
> Rich
>



What does
$ su - postgres
$ pg_ctl -D /var/lib/pgsql/data start
$ ps -ef|grep post

return?

Re: Cannot Start Postgres After System Boot

From
Andrej
Date:
On 22 October 2010 07:45, Rich Shepard <rshepard@appl-ecosys.com> wrote:
>  When I run 'ps ax | grep post' I found a few postgres processes. I tried
> '/etc/rc.d/rc.postgresql stop' but that had no effect. I killed the lowest
> numbered process and that removed them all. However, I still cannot start a
> new postgresql process.
I just stumbled upon your post from two years ago; has
your setup changed since then?


Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.georgedillon.com/web/html_email_is_evil.shtml

Re: Cannot Start Postgres After System Boot [SOLVED]

From
Rich Shepard
Date:
On Thu, 21 Oct 2010, Adrian Klaver wrote:

> The missing piece of information seems to be the system board failure. My
> guess is that caused corruption.  See if you can connect by doing:
>
> psql -h localhost -d aesi

Adrian,

[rshepard@salmo ~]$ psql -h localhost -d aesi
psql: could not connect to server: Connection refused
         Is the server running on host "localhost" and accepting
         TCP/IP connections on port 5432?

   Let's try something different:

[rshepard@salmo ~]$ su - postgres
Password:
postgres@salmo:~$ postgres -D /var/lib/pgsql/data &
[1] 17910
postgres@salmo:~$ FATAL:  bogus data in lock file "postmaster.pid": ""

   So, I rm postmaster.pid and run again as user postgres and ... it works!
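
Before removing a pidfile by hand, it is worth confirming that no postmaster still owns it, given the warnings earlier in the thread about two postmasters on one data directory. The sketch below only reports the state and deletes nothing. It assumes the first line of postmaster.pid holds the postmaster's PID and that it runs as root or postgres, so that "kill -0" can see the process; pidfile_state is a hypothetical name.

```shell
#!/bin/sh
# Sketch: classify a pidfile as missing, bogus (empty or non-numeric),
# live (the PID names a running process), or stale. Nothing is deleted.
#   $1 = path to the pidfile
pidfile_state() {
    pidfile=$1
    [ -e "$pidfile" ] || { echo missing; return 0; }
    # The first line is expected to be the PID; an empty file leaves pid empty.
    read -r pid < "$pidfile" || pid=
    case $pid in
        ''|*[!0-9]*) echo bogus; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo live
    else
        echo stale
    fi
}

# Usage: pidfile_state /var/lib/pgsql/data/postmaster.pid
```

An empty file like the one in the error above would be reported as "bogus"; only a "live" result means something is still running under that PID.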

   Now that it's working again, can I assume the problem is that the
rc.postgresql init script runs as root rather than as user postgres?
If that's the case, I need to learn how to effectively su to user postgres
during the boot process so postgresql starts as it should.

   Suggestions, anyone?

   And thanks to all of you for helping me climb out of the hole in which I
was stuck.

   Now I need to re-read how to properly and cleanly upgrade postgres and
move from 8.3.3 to 8.4.5. (I've just posted a question on the CMS MadeSimple
forum asking if there's an issue with 9.0. If not, that's to what I'll
upgrade.)

Much grasses,

Rich


Re: Cannot Start Postgres After System Boot

From
Scott Marlowe
Date:
On Thu, Oct 21, 2010 at 12:38 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote:
> On Thu, 21 Oct 2010, Reid Thompson wrote:
>
>> what does
>> $ netstat -an |grep 5432
>> return?
>>
>> something is running on tcp port 5432
>
>  Doesn't show that.
>
> [rshepard@salmo ~]$ netstat -an |grep 5432
> tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN

That's exactly what it shows.

Re: Cannot Start Postgres After System Boot

From
Tom Lane
Date:
Rich Shepard <rshepard@appl-ecosys.com> writes:
>    Since I cannot start the postmaster I cannot run pg_dumpall.

As far as I can tell you *are* starting the postmaster, and it is
responding when you query it via TCP (eg, with "psql -h localhost").
What is not working is connections via the Unix socket.  I still
suspect that the problem there is that the postmaster is creating
the socket file somewhere other than /tmp, but your client library
thinks /tmp is where to look.
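
To make the distinction concrete: with no host given, libpq connects through a Unix socket in its compiled-in default directory; a host that starts with a slash is taken as a socket directory; anything else is a TCP host. That is why "psql aesi" and "psql -h localhost" can behave differently against the same server. The sketch below just restates that rule; pg_transport is a made-up helper and it ignores other connection options such as hostaddr.

```shell
#!/bin/sh
# Sketch: describe which transport libpq would pick for a given host value.
# With no argument, the PGHOST environment variable is consulted.
pg_transport() {
    host=${1-${PGHOST-}}
    case $host in
        '') echo "unix socket in the client's default directory" ;;
        /*) echo "unix socket in $host" ;;
        *)  echo "tcp to $host" ;;
    esac
}

# Usage:
#   pg_transport                      (what plain "psql aesi" would use)
#   pg_transport localhost            (what "psql -h localhost" would use)
#   pg_transport /var/run/postgresql  (a socket directory passed via -h)
```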

            regards, tom lane