Race-condition with failed block-write? - Mailing list pgsql-bugs
From | Arjen van der Meijden |
---|---|
Subject | Race-condition with failed block-write? |
Date | |
Msg-id | 43214D5F.9080502@tweakers.net Whole thread Raw |
Responses |
Re: Race-condition with failed block-write?
Re: Race-condition with failed block-write? |
List | pgsql-bugs |
Hi list, I noticed a development machine was responding terribly slow, and found out postgresql had consumed all available memory: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND postgres 7465 0.0 0.0 16636 928 ? S Sep05 0:00 /usr/bin/postmaster -D /var/lib/postgresql/data postgres 7472 0.0 81.0 3144408 832032 ? D Sep05 0:56 \_ postgres: writer process postgres 7473 0.0 0.0 7552 916 ? S Sep05 0:00 \_ postgres: stats buffer process postgres 7474 0.0 0.0 6848 900 ? S Sep05 0:00 \_ postgres: stats collector process It had taken over 800MB of memory. The machine is a P4 3.0 Ghz (with HT enabled), 1GB of memory a few SATA disks and runs with a recent Gentoo + 2.6.12.5 kernel. The postgresql version that failed was 8.0.3, it may or may not be worth knowing that it runs a 8.1devel on another port and from another directory. The 8.0.3-database was not actively in use, since our current development work was using the 8.1devel. In the postgresql.log a write-failure messages was repeated enough to make the log file 50MB larger: This was the first one: [ - 2005-09-07 13:03:47 CEST @] ERROR: xlog flush request 29/67713428 is not satisfied --- flushed only to 29/2E73E488 [ - 2005-09-07 13:03:47 CEST @] CONTEXT: writing block 21 of relation 1663/2013826/9975789 [ - 2005-09-07 13:03:48 CEST @] ERROR: xlog flush request 29/67713428 is not satisfied --- flushed only to 29/2E73E488 [ - 2005-09-07 13:03:48 CEST @] CONTEXT: writing block 21 of relation 1663/2013826/9975789 [ - 2005-09-07 13:03:48 CEST @] WARNING: could not write block 21 of 1663/2013826/9975789 [ - 2005-09-07 13:03:48 CEST @] DETAIL: Multiple failures --- write error may be permanent. And those 4 lines were repeated up untill the memory was empty, as it seems from the log: TopMemoryContext: -1095880208 total in 264213 blocks; 537938888 free (924739 chunks); -1633819096 used MdSmgr: 8192 total in 1 blocks; 8024 free (0 chunks); 168 used Pending Ops Table: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used DynaHash: 8192 total in 1 blocks; 7488 free (0 chunks); 704 used smgr relation table: 8192 total in 1 blocks; 4048 free (0 chunks); 4144 used LockTable (locallock hash): 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used [ - 2005-09-09 02:42:22 CEST @] ERROR: out of memory [ - 2005-09-09 02:42:22 CEST @] DETAIL: Failed on request of size 16000. After issueing an immediate-shutdown, it starts and runs just fine: [ - 2005-09-09 10:19:51 CEST @] LOG: received fast shutdown request [ - 2005-09-09 10:23:01 CEST @] LOG: received immediate shutdown request [ - 2005-09-09 10:27:20 CEST @] LOG: database system was interrupted at 2005-09-05 11:24:08 CEST [ - 2005-09-09 10:27:20 CEST @] LOG: checkpoint record is at 29/2E73E44C [ - 2005-09-09 10:27:20 CEST @] LOG: redo record is at 29/2E73E44C; undo record is at 0/0; shutdown TRUE [ - 2005-09-09 10:27:20 CEST @] LOG: next transaction ID: 166361; next OID: 55768251 [ - 2005-09-09 10:27:20 CEST @] LOG: database system was not properly shut down; automatic recovery in progress [ - 2005-09-09 10:27:20 CEST @] LOG: record with zero length at 29/2E73E488 [ - 2005-09-09 10:27:20 CEST @] LOG: redo is not required [ - 2005-09-09 10:27:20 CEST @] LOG: database system is ready At 2005-09-05 the machine was rebooted and the only query since then (which may have triggered the issue) was: [2005-09-07 13:03:35 CEST - 2005-09-07 13:03:37 CEST acm@acm] LOG: statement: SELECT c.oid, n.nspname, c.relname FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE pg_catalog.pg_table_is_visible(c.oid) AND c.relname ~ '^pwprijs$' ORDER BY 2, 3; I shut down the postgresql-server after the restart again to prevent further changes to the data-files of it. Best regards, Arjen van der Meijden
pgsql-bugs by date: