Thread: postgres locks...

postgres locks...

From: Jeff Davis
Date: 04 March 2001, 04:20:33

I think this is a bug, but I don't really have enough info to go on for
a report. I was running postgres on my server and everything seemed OK.
Eventually I figured out that some of my websites weren't loading and
connected it to postgres.

Basically, psql would sit there trying to connect, meanwhile all the
attempted connections wouldn't die. I had to "kill -9" all the
postgres/postmaster processes. Then I ran pg_ctl start, and it seemed
OK. However, same problem. So, I killed again, then ran ipcclean (there
were shm errors), then started it and it worked. I had to delete a socket
file in /tmp also in order to get it going again.

During this time, I apparently lost several unimportant tables. I am
running 7.0.3/debian-woody. Is this a known problem? I really need to
have a reliable version of postgres running.

Thanks for any advice,
   Jeff Davis

Re: postgres locks...

From: Tom Lane
Date: 04 March 2001, 11:39:22

Jeff Davis <jdavis@dynworks.com> writes:
> Basically, psql would sit there trying to connect, meanwhile all the
> attempted connections wouldn't die.

More details please?  What do you mean exactly by the above statements?

> I had to "kill -9" all the
> postgres/postmaster processes.

Killing individual backends with kill -9 is NOT NOT NOT a recommended
procedure.  In theory you can get away with it but why take risks?
Use the documented shutdown procedures to give the thing some chance
of cleaning up after itself.

            regards, tom lane
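
For reference, the documented shutdown procedure mentioned above is normally driven through pg_ctl's stop modes. The following is a minimal dry-run sketch, assuming the `-m smart|fast|immediate` options of `pg_ctl stop` and a hypothetical data directory `/usr/local/pgsql/data` (neither is stated in this thread, so check the documentation for the installed version). It only prints the commands, from gentlest to most abrupt, and runs nothing:

```shell
# Dry-run helper: prints pg_ctl stop commands in escalation order.
# It does not execute them; the data directory is passed in by the caller.
shutdown_plan() {
    datadir=$1
    for mode in smart fast immediate; do
        printf 'pg_ctl -D %s stop -m %s\n' "$datadir" "$mode"
    done
}

# Hypothetical data directory, used here only for illustration.
shutdown_plan /usr/local/pgsql/data
```

The idea is to try each mode in turn and move to the next only if the server is still running, leaving `kill -9` out of the sequence entirely.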

Re: postgres locks... [more information]

From: Jeff Davis
Date: 04 March 2001, 19:07:20

Ok, I think I know what happened, kind of..

One of my users' databases was corrupt. I actually had to delete the
entire base/<dbname> folder. Then I went in and dropped and re-created
the DB, and it seems to work fine.

Is there any way to know how this happened? I have a .tar of the corrupt
DB if someone is interested...

I would hope that 7.1 fixes this issue. Any thoughts?

Thanks,
   Jeff Davis



Re: postgres locks...

From: Jeff Davis
Date: 04 March 2001, 21:22:01

Tom Lane wrote:

> Jeff Davis <jdavis@dynworks.com> writes:
>
>> Basically, psql would sit there trying to connect, meanwhile all the
>> attempted connections wouldn't die.
>
>
> More details please?  What do you mean exactly by the above statements?
>
>> I had to "kill -9" all the
>> postgres/postmaster processes.
>
>
> Killing individual backends with kill -9 is NOT NOT NOT a recommended
> procedure.  In theory you can get away with it but why take risks?
> Use the documented shutdown procedures to give the thing some chance
> of cleaning up after itself.
>
>             regards, tom lane
>
>
>
I realize that it is not good to kill the backends, but I tried the
documented SIGTERM to postmaster and it didn't work. SIGKILL was the
only thing that would kill a hanging backend, and once those were down I
could gracefully SIGTERM the postmaster with pg_ctl.
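
The "ask politely, then check" part of this can be sketched as a small shell helper. This is only an illustration under stated assumptions: `term_and_wait` is a hypothetical name, a plain `sleep` process stands in for a backend, and the helper deliberately stops short of sending SIGKILL so that the decision stays with the operator:

```shell
# Send SIGTERM to a process and wait up to $2 seconds for it to go away.
# Returns 0 if the process is gone, 1 if it is still running.
term_and_wait() {
    pid=$1
    timeout=$2
    # If the signal cannot be delivered, this sketch treats the process as
    # already gone (it does not distinguish a permissions error).
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # kill -0 only tests for existence
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    return 1
}

# Stand-in for a backend: a sleep process, which exits on SIGTERM.
sleep 60 &
victim=$!
if term_and_wait "$victim" 5; then
    echo "process $victim exited after SIGTERM"
else
    echo "process $victim ignored SIGTERM; SIGKILL would be the last resort"
fi
```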

I researched the problem a little more (as I mentioned above in a reply
to my own message). What would happen is this:
1) I try to make a standard connection to database "A" with psql
2) psql sits there doing nothing for a seemingly infinite amount of time
3) I Ctrl-C psql to get back to the shell after several minutes
4) I look at the output of "ps ax| grep post"
and get processes like:
/usr/local/pgsql/bin/postgres [args about how I tried to connect]
as well as the postmaster
5) I try to stop using pg_ctl (seems ok)
6) I try to start with pg_ctl (gives error about a /tmp/.s.PGSQL.xxxx file)
7) I delete the file, and try again (shared mem errors)
8) I run ipcclean and it seems to eventually work (BTW: I looked at the
script and it seems to check for the output of a "ps ... | grep ..."
command, which sometimes returns the grep process itself, and sometimes
doesn't, so I had to run it until it didn't think the backend was running).
9) I start it successfully
10) same thing happens
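
The intermittent behaviour described in step 8 is what one would expect from a `ps ... | grep ...` check, since whether the grep process shows up in its own input depends on timing. Below is a minimal sketch of the effect and of one common workaround, using hard-coded sample `ps` lines. The PIDs and command lines are invented for illustration, and this is not the actual ipcclean script:

```shell
# Hypothetical `ps ax` snapshots; PIDs and arguments are made up.
# Snapshot taken while a plain `grep postmaster` is in the process list:
ps_plain='  101 ?     S  0:00 /usr/local/pgsql/bin/postmaster -i
  202 pts/0 S  0:00 grep postmaster'
# Snapshot taken while `grep [p]ostmaster` is in the process list:
ps_bracket='  101 ?     S  0:00 /usr/local/pgsql/bin/postmaster -i
  203 pts/0 S  0:00 grep [p]ostmaster'

# Naive check: the grep process is counted as if it were a postmaster.
naive_count=$(printf '%s\n' "$ps_plain" | grep -c 'postmaster')

# Bracket trick: the regex [p]ostmaster still matches "postmaster", but it
# does not match the literal text "[p]ostmaster" on grep's own command line.
bracket_count=$(printf '%s\n' "$ps_bracket" | grep -c '[p]ostmaster')

echo "naive: $naive_count, bracket: $bracket_count"
```

With the bracketed pattern the result no longer depends on whether grep happened to be listed, so a check built this way should not need to be re-run until it gives the right answer.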

If I initdb another location it is fine. Apparently, the only bad
database is database A; the rest can be connected to from my real
location anyway. After I dropped/recreated the database, it worked fine
(but no more tables, obviously). The rest of the DBs were unaffected. I
have a .tar of the "bad" DB, if that would help.

I apologize if this is not 100% accurate, but it should be very close to
what happened. I didn't want to take my backend down for another hour to
recreate the problem a third time.

Thanks for any more help,
   Jeff Davis