Vacuum full crash - Mailing list pgsql-admin
From | Mikko Partio |
---|---|
Subject | Vacuum full crash |
Date | |
Msg-id | 2ca799770803291340p75d8a4e2o7dfd6c61e73270f6@mail.gmail.com |
List | pgsql-admin |
Hello list
An interrupted VACUUM FULL has just caused a PG instance to restart and recover. Background:
select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.1 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
(1 row)
I have a largish (>1 TB) database which is kind of a data warehouse. Recently I had to do some major operations on some of the tables (updating all rows in a table, etc.), which caused major bloat. To remove the bloat, I ran VACUUM FULL VERBOSE on the bloated tables. Before the vacuum finished, I had to abort it due to problems with my own laptop. When I hit Ctrl+C on the vacuum, the PG instance suddenly went into recovery mode. The logs showed this:
2008-03-29 22:25:15 EET [26841]: [1-1] ERROR: canceling statement due to user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT: vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort transaction 3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG: server process (PID 26841) was terminated by signal 6: Aborted
2008-03-29 22:25:15 EET [6476]: [5-1] LOG: terminating any other active server processes
2008-03-29 22:25:15 EET [24814]: [48-1] WARNING: terminating connection because of crash of another server process
2008-03-29 22:25:15 EET [24814]: [49-1] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2008-03-29 22:25:15 EET [24814]: [50-1] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2008-03-29 22:25:15 EET [6476]: [6-1] LOG: archiver process (PID 6489) exited with exit code 1
2008-03-29 22:25:15 EET [18228]: [1-1] FATAL: the database system is in recovery mode
2008-03-29 22:25:16 EET [6476]: [7-1] LOG: all server processes terminated; reinitializing
2008-03-29 22:25:16 EET [18229]: [1-1] LOG: database system was interrupted; last known up at 2008-03-29 22:20:24 EET
2008-03-29 22:25:16 EET [18229]: [2-1] LOG: database system was not properly shut down; automatic recovery in progress
2008-03-29 22:25:16 EET [18229]: [3-1] LOG: redo starts at CB5/16399698
2008-03-29 22:25:22 EET [18229]: [4-1] LOG: unexpected pageaddr CB4/76FF6000 in log file 3253, segment 35, offset 16736256
2008-03-29 22:25:22 EET [18229]: [5-1] LOG: redo done at CB5/23FF4C80
2008-03-29 22:25:22 EET [18229]: [6-1] LOG: last completed transaction was at log time 2008-03-29 22:22:47.931231+02
2008-03-29 22:25:23 EET [18336]: [1-1] FATAL: the database system is in recovery mode
2008-03-29 22:25:27 EET [18337]: [1-1] FATAL: the database system is in recovery mode
2008-03-29 22:25:30 EET [18346]: [1-1] FATAL: the database system is in recovery mode
2008-03-29 22:25:41 EET [18424]: [1-1] FATAL: the database system is in recovery mode
2008-03-29 22:25:43 EET [18427]: [1-1] LOG: autovacuum launcher started
2008-03-29 22:25:43 EET [6476]: [8-1] LOG: database system is ready to accept connections
This seems quite serious to me ("cannot abort a transaction that has already committed"). What can cause such behaviour?
Regards
Mikko
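For anyone triaging a similar log, here is a minimal Python sketch that pulls the PANIC entry out of such an excerpt and estimates how long the instance was unavailable. It assumes every line follows the prefix layout seen in the log above (timestamp, time zone, PID in brackets, session line counter, severity). The names parse_line and crash_summary are made up for this illustration and are not part of PostgreSQL or any tool.

```python
import re
from datetime import datetime

# Matches lines shaped like the ones in the log excerpt, e.g.
# 2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort ...
# Assumption: the prefix layout is identical on every line.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ "
    r"\[(?P<pid>\d+)\]: \[\d+-\d+\] "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"),
        "pid": int(m["pid"]),
        "level": m["level"],
        "msg": m["msg"],
    }


def crash_summary(lines):
    """Find the first PANIC and the first 'ready' message after it."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    panic = next((e for e in entries if e["level"] == "PANIC"), None)
    if panic is None:
        return None
    ready = next(
        (
            e
            for e in entries
            if e["time"] >= panic["time"]
            and e["msg"].startswith("database system is ready")
        ),
        None,
    )
    downtime = None
    if ready is not None:
        downtime = (ready["time"] - panic["time"]).total_seconds()
    return {
        "panic_pid": panic["pid"],
        "panic_msg": panic["msg"],
        "downtime_seconds": downtime,
    }


sample = [
    "2008-03-29 22:25:15 EET [26841]: [3-1] PANIC: cannot abort "
    "transaction 3778747509, it was already committed",
    "2008-03-29 22:25:43 EET [6476]: [8-1] LOG: database system is "
    "ready to accept connections",
]
print(crash_summary(sample))
```

The timestamps in this format only have one-second resolution, so the downtime figure is approximate. Lines that do not match the assumed prefix are skipped rather than raising an error.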