Tom Lane wrote:
> Jeff Davis <jdavis@dynworks.com> writes:
>
>> Basically, psql would sit there trying to connect, meanwhile all the
>> attempted connections wouldn't die.
>
>
> More details please? What do you mean exactly by the above statements?
>
>> I had to "kill -9" all the
>> postgres/postmaster processes.
>
>
> Killing individual backends with kill -9 is NOT NOT NOT a recommended
> procedure. In theory you can get away with it but why take risks?
> Use the documented shutdown procedures to give the thing some chance
> of cleaning up after itself.
>
> regards, tom lane
>
>
>
I realize that it is not good to kill the backends, but I tried the
documented SIGTERM to the postmaster and it didn't work. SIGKILL was the
only thing that would kill a hanging backend, and once those were down I
could gracefully SIGTERM the postmaster with pg_ctl.

I researched the problem a little more (as I mentioned in a reply to
my own message). What would happen is this:
1) I try to make a standard connection to database "A" with psql
2) psql sits there doing nothing for a seemingly infinite amount of time
3) I Ctrl-C psql to get back to the shell after several minutes
4) I look at the output of "ps ax| grep post"
and get processes like:
/usr/local/pgsql/bin/postgres [args about how I tried to connect]
as well as the postmaster
5) I try to stop using pg_ctl (seems ok)
6) I try to start with pg_ctl (gives error about a /tmp/.s.xxxx.PGSQL file)
7) I delete the file, and try again (shared mem errors)
8) I run ipcclean and it seems to eventually work (BTW: I looked at the
script and it seems to check for the output of a "ps ... | grep ..."
command, which sometimes returns the grep process itself, and sometimes
doesn't, so I had to run it until it didn't think the backend was running).
9) I start it successfully
10) same thing happens
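
The "ps ... | grep ..." problem from step 8 can be avoided with the usual bracket trick. This is only a minimal sketch, not what ipcclean actually does; the function name filter_backends is made up for illustration. The pattern '[p]ost' still matches "postgres" and "postmaster", but the grep process's own command line contains the literal text "[p]ost", which the pattern does not match, so grep never reports itself.

```shell
# Hypothetical helper (not part of ipcclean): read a process listing on
# stdin and keep only lines mentioning postgres/postmaster.
filter_backends() {
    # '[p]ost' is a regex meaning "p followed by ost". The grep command
    # itself shows up in ps as "grep [p]ost", which does not contain the
    # plain string "post", so it is filtered out every time.
    grep '[p]ost'
}
```

Typical use would be "ps ax | filter_backends"; an empty result then reliably means no postmaster or backend is left, regardless of whether ps happened to catch the grep process.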
If I initdb another location it is fine. Apparently, the only bad
database is database A; the rest can still be connected to from my real
location. After I dropped and recreated the database, it worked fine
(but with no more tables, obviously). The rest of the DBs were unaffected. I
have a .tar of the "bad" DB, if that would help.
I apologize if this is not 100% accurate, but it should be very close to
what happened. I didn't want to take my backend down for another hour to
recreate the problem a third time.
Thanks for any more help,
Jeff Davis