An enthusiastic person in our content department went and did a silly thing ...
Well, he went and fired off an update that consumed all of the remaining disk space on two runtime servers.
We've fallen back to a hot spare and I am faced with trying to recover these machines by Tuesday morning, when we
expect some increase in traffic.
Postgres version is 7.4; the only things in the /data directory are the postgres data and related files:
$ du
3632 ./gex_runtime/base/1
4468 ./gex_runtime/base/17141
0 ./gex_runtime/base/138602992/pgsql_tmp
32682348 ./gex_runtime/base/138602992
32690448 ./gex_runtime/base
340 ./gex_runtime/global
492120 ./gex_runtime/pg_xlog
7660 ./gex_runtime/pg_clog
33190592 ./gex_runtime
0 ./bkup
33190592 .
The log is saying:
HINT: In a moment you should be able to reconnect to the database and repeat your command.
2006-01-01 23:20:19 WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
2006-01-01 23:20:19 LOG: could not close temporary statistics file
"/data/postgres/gex_runtime/global/pgstat.tmp.1413": No space left on device
Available space is:
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 32850580 3137552 28044280 11% /
/dev/sdb1 35001508 33223500 16 100% /data
Any suggestions? Falling back to the last known state is fine, but just in case, I am making a backup of the remaining
database to build a replacement.
And yes, I did foresee this and did warn management repeatedly, and yet somehow the advice falls on deaf ears. Go figure.
I guess maybe because it isn't management that gets a hole kicked in a 3-day weekend.
Greg Williamson
DBA (for now at least)
GlobeXplorer LLC