Thread: postgres locks...
I think this is a bug, but I don't really have enough information for a report. I was running postgres on my server and everything seemed OK. Eventually I figured out that some of my websites weren't loading and traced it to postgres.

Basically, psql would sit there trying to connect, and meanwhile none of the attempted connections would die. I had to "kill -9" all the postgres/postmaster processes. Then I ran pg_ctl start, and it seemed OK. However, the same problem came back. So I killed them again, ran ipcclean (there were shm errors), then started it and it worked. I also had to delete a socket file in /tmp to get it going again.

During that time, I apparently lost several unimportant tables. I am running 7.0.3/debian-woody. Is this a known problem? I really need to have a reliable version of postgres running.

Thanks for any advice,
Jeff Davis
Jeff Davis <jdavis@dynworks.com> writes:
> Basically, psql would sit there trying to connect, meanwhile all the
> attempted connections wouldn't die.

More details please? What do you mean exactly by the above statements?

> I had to "kill -9" all the
> postgres/postmaster processes.

Killing individual backends with kill -9 is NOT NOT NOT a recommended procedure. In theory you can get away with it but why take risks? Use the documented shutdown procedures to give the thing some chance of cleaning up after itself.

regards, tom lane
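As a rough illustration of "use the documented shutdown procedures", here is a minimal Python sketch that walks through the postmaster's shutdown signals from gentlest to harshest and never falls back to SIGKILL. The mapping of SIGTERM, SIGINT and SIGQUIT to "smart", "fast" and "immediate" shutdown is an assumption taken from the PostgreSQL 7.x documentation. The function name `stop_postmaster` and its `send` / `is_alive` hooks are hypothetical and are not part of any PostgreSQL tool.

```python
import signal
import time

# Shutdown signals for the postmaster, gentlest first.
# Assumption: mode names follow the PostgreSQL 7.x documentation.
SHUTDOWN_STEPS = [
    ("smart", signal.SIGTERM),      # wait for clients to disconnect
    ("fast", signal.SIGINT),        # abort transactions, then exit
    ("immediate", signal.SIGQUIT),  # exit without a clean shutdown
]


def stop_postmaster(pid, send, is_alive, wait=30.0, poll=0.5, sleep=time.sleep):
    """Try each shutdown signal in turn; SIGKILL is never sent.

    send(pid, sig) delivers a signal and is_alive(pid) reports whether the
    process still exists. Both are hypothetical hooks supplied by the caller
    (os.kill would be the usual choice for send).
    Returns the name of the mode that worked, or None if none did.
    """
    polls = max(1, int(wait / poll))
    for mode, sig in SHUTDOWN_STEPS:
        send(pid, sig)
        for _ in range(polls):
            if not is_alive(pid):
                return mode
            sleep(poll)
        # One last check before escalating to the next signal.
        if not is_alive(pid):
            return mode
    return None
```

If this returns None, the server is still running after all three signals. That is the point at which to collect details for a report rather than reach for kill -9.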
Ok, I think I know what happened, kind of. One of my users' databases was corrupt. I actually had to delete the entire base/<dbname> folder. Then I went in and dropped/created the DB again, and it seems to work fine.

Is there any way to know how this happened? I have a .tar of the corrupt DB if someone is interested. I would hope that 7.1 fixes this issue; any thoughts?

Thanks,
Jeff Davis
Tom Lane wrote:
> Jeff Davis <jdavis@dynworks.com> writes:
>
>> Basically, psql would sit there trying to connect, meanwhile all the
>> attempted connections wouldn't die.
>
> More details please? What do you mean exactly by the above statements?
>
>> I had to "kill -9" all the
>> postgres/postmaster processes.
>
> Killing individual backends with kill -9 is NOT NOT NOT a recommended
> procedure. In theory you can get away with it but why take risks?
> Use the documented shutdown procedures to give the thing some chance
> of cleaning up after itself.
>
> regards, tom lane

I realize that it is not good to kill the backends, but I tried the documented SIGTERM to the postmaster and it didn't work. SIGKILL was the only thing that would kill a hanging backend, and once those were down I could gracefully SIGTERM the postmaster with pg_ctl.

I researched the problem a little more (as I mentioned above in a reply to my own message). What would happen is this:

1) I try to make a standard connection to database "A" with psql
2) psql sits there doing nothing for a seemingly infinite amount of time
3) I Ctrl-C psql to get back to the shell after several minutes
4) I look at the output of "ps ax| grep post" and get processes like:
   /usr/local/pgsql/bin/postgres [args about how I tried to connect]
   as well as the postmaster
5) I try to stop using pg_ctl (seems ok)
6) I try to start with pg_ctl (gives an error about a /tmp/.s.xxxx.PGSQL file)
7) I delete the file, and try again (shared mem errors)
8) I run ipcclean and it seems to eventually work (BTW: I looked at the script and it seems to check the output of a "ps ... | grep ..." command, which sometimes returns the grep process itself and sometimes doesn't, so I had to run it until it didn't think the backend was running)
9) I start it successfully
10) same thing happens

If I initdb another location it is fine. Apparently, the only bad database is database A; the rest can still be connected to from my real location.
After I dropped/recreated the database, it worked fine (but no more tables, obviously). The rest of the DBs were unaffected. I have a .tar of the "bad" DB, if that would help.

I apologize if this is not 100% accurate, but it should be very close to what happened. I didn't want to take my backend down for another hour to recreate the problem a third time.

Thanks for any more help,
Jeff Davis
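The "ps ... | grep ..." check described in step 8 is flaky because grep's own command line contains the search string, so it can match itself. A minimal Python sketch of a more robust check is below: it matches on the executable name of each process instead of searching the whole line. It assumes the default Linux `ps ax` column layout (PID TTY STAT TIME COMMAND). The function name `postgres_processes` is hypothetical, and this is not how ipcclean itself works.

```python
def postgres_processes(ps_output, names=("postmaster", "postgres")):
    """Return (pid, command) pairs for real server processes in `ps ax` text.

    Assumption: columns are PID TTY STAT TIME COMMAND, as in Linux `ps ax`.
    Matching on the executable's base name means a line such as
    "grep post" is never mistaken for a backend.
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line or something we cannot parse
        pid, command = int(fields[0]), fields[4]
        # First word of the command, without its directory; a trailing
        # colon is stripped in case the process rewrote its title.
        executable = command.split()[0].rsplit("/", 1)[-1].rstrip(":")
        if executable in names:
            found.append((pid, command))
    return found
```

Feeding it the captured text of `ps ax` gives a list that is empty exactly when no postmaster or backend is left, whether or not a grep happened to be running at the time.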