Severe Badness On My Server: psql: FATAL: the database system is starting up - Mailing list pgsql-admin
From | Mitchell Laks |
---|---|
Subject | Severe Badness On My Server: psql: FATAL: the database system is starting up |
Date | |
Msg-id | 200503131112.01120.mlaks@verizon.net |
Responses | Re: Severe Badness On My Server: psql: FATAL: the database system is starting up |
List | pgsql-admin |
Dear Gurus:

My server and I have had a very bad weekend, starting Friday afternoon.

I am running Debian Sarge and PostgreSQL 7.4.6 with Linux kernel 2.6.8. I am running a PostgreSQL-backed application on a remote server. The system has a system drive, on which the PostgreSQL database runs, and a RAID 1 drive on which the application stores data.

Well, the RAID 1 failed (or is failing - or is trying its hardest to fail, not clear yet...). This should not have affected the PostgreSQL database, as it is safely on a separate drive.

However, when I logged onto the system, I found that I could not turn off PostgreSQL. I logged in as postgres and did pg_ctl stop, and it did ....... and then could not stop (presumably because hanging client applications were not logged off the database).

So then I killed all the application clients (kill -9 on each of them), and still pg_ctl stop did not want to stop. So I looked in ps aux, and the client applications were in D status:

    wustl 18232 0.0 0.2 4872 1920 ? D Mar11 0:00 /usr/local/ctn/bi

I then tried to reboot the system remotely by logging in as root and running shutdown -r now and even shutdown -h now. Interestingly enough, the system refused to shut down (I have never ever seen this!!!!!!!). I was floored!

Well, what to do? I decided to sleep on it.

I logged in again on Saturday night and the system was still hanging in this bizarre state. I now saw queued shutdown requests in ps aux. And nothing was happening fast.

I thought. I read a little. I tried pg_ctl stop -m fast. It did nothing. I prayed. I tried pg_dump LTA_IDB >lta_idb.dump to dump the database in question. It didn't do anything.

I was desperate. I decided to try desperate measures. I then pulled the gun: pg_ctl stop -m i. OK, so it stopped. Then I said, let me try to dump the database, and so I did pg_ctl start.
It started:

    postgres@A1:~$ pg_ctl status
    pg_ctl: postmaster is running (PID: 21195)
    Command line was:
    /usr/lib/postgresql/bin/postmaster

Then I tried to dump the database and I got a message saying FATAL: the database system is starting up. I waited a while and then tried again. Same message. I then tried psql LTA_IDB as the user of the database and got the same message: FATAL, the database system is starting up. Then I tried psql LTA_IDB again and got the same thing. I waited.

Then I did pg_ctl stop (I don't know why I did it. Perversity, I think.) It then said to me ................ something about being unable to stop.

Then I did:

    postgres@A1:~$ pg_dump LTA_IDB>lta_idb.dump
    2005-03-13 10:56:33 [21481] LOG: connection received: host=[local] port=
    2005-03-13 10:56:33 [21481] FATAL: the database system is shutting down
    pg_dump: [archiver (db)] connection to database "LTA_IDB" failed: FATAL: the dn

Now I did pg_ctl status:

    postgres@A1:~$ pg_ctl status
    pg_ctl: postmaster is running (PID: 21195)
    Command line was:
    /usr/lib/postgresql/bin/postmaster

OK, I feel like I am in the twilight zone.

Next, as root, I did cd /var/log and ls postg*:

    A1:/var/log# ls post*
    postgres.log        postgres.log.2.gz  postgres.log.5.gz  postgres.log.8.gz
    postgres.log.1      postgres.log.3.gz  postgres.log.6.gz  postgres.log.9.gz
    postgres.log.10.gz  postgres.log.4.gz  postgres.log.7.gz
    A1:/var/log# less postgres.log
    postgres.log: No such file or directory

WHAT????????

df -h shows:

    /dev/sda2  9.2G  2.8G  6.0G  32%  /
    tmpfs      443M     0  443M   0%  /dev/shm
    /dev/sda1   89M   11M   74M  13%  /boot
    /dev/sda3  7.4G  273M  6.7G   4%  /home
    /dev/sda8   11G   33M  9.9G   1%  /mirror
    /dev/sda7  449M  8.1M  417M   2%  /tmp
    /dev/sda6  7.4G  4.7G  2.4G  67%  /var
    /dev/md0   230G  139G   80G  64%  /home/big0

I am in the twilight zone. My sanity is suspect.

Any ideas on what to do next? Pull the plug????

Mitchell