Vacuum full crash - Mailing list pgsql-admin

From Mikko Partio
Subject Vacuum full crash
Msg-id 2ca799770803291340p75d8a4e2o7dfd6c61e73270f6@mail.gmail.com
Hello list

An interrupted VACUUM FULL has just caused a PG instance to restart and recover. Background:

select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.1 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
(1 row)

I have a largish (>1 TB) database which is kind of a data warehouse. Recently I had to do some major operations on some of the tables (updating all rows in a table, etc.), which caused major bloat. To remove the bloat, I ran VACUUM FULL VERBOSE on the bloated tables. Before the vacuum finished, I had to abort it due to problems with my own laptop. When I hit Ctrl+C to cancel the vacuum, the PG instance suddenly went into recovery mode. The logs showed this:

2008-03-29 22:25:15 EET [26841]: [1-1] ERROR:  canceling statement due to user request
2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT:  vacuum full verbose xyz ;
2008-03-29 22:25:15 EET [26841]: [3-1] PANIC:  cannot abort transaction 3778747509, it was already committed
2008-03-29 22:25:15 EET [6476]: [4-1] LOG:  server process (PID 26841) was terminated by signal 6: Aborted
2008-03-29 22:25:15 EET [6476]: [5-1] LOG:  terminating any other active server processes
2008-03-29 22:25:15 EET [24814]: [48-1] WARNING:  terminating connection because of crash of another server process
2008-03-29 22:25:15 EET [24814]: [49-1] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2008-03-29 22:25:15 EET [24814]: [50-1] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2008-03-29 22:25:15 EET [6476]: [6-1] LOG:  archiver process (PID 6489) exited with exit code 1
2008-03-29 22:25:15 EET [18228]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:16 EET [6476]: [7-1] LOG:  all server processes terminated; reinitializing
2008-03-29 22:25:16 EET [18229]: [1-1] LOG:  database system was interrupted; last known up at 2008-03-29 22:20:24 EET
2008-03-29 22:25:16 EET [18229]: [2-1] LOG:  database system was not properly shut down; automatic recovery in progress
2008-03-29 22:25:16 EET [18229]: [3-1] LOG:  redo starts at CB5/16399698
2008-03-29 22:25:22 EET [18229]: [4-1] LOG:  unexpected pageaddr CB4/76FF6000 in log file 3253, segment 35, offset 16736256
2008-03-29 22:25:22 EET [18229]: [5-1] LOG:  redo done at CB5/23FF4C80
2008-03-29 22:25:22 EET [18229]: [6-1] LOG:  last completed transaction was at log time 2008-03-29 22:22:47.931231+02
2008-03-29 22:25:23 EET [18336]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:27 EET [18337]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:30 EET [18346]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:41 EET [18424]: [1-1] FATAL:  the database system is in recovery mode
2008-03-29 22:25:43 EET [18427]: [1-1] LOG:  autovacuum launcher started
2008-03-29 22:25:43 EET [6476]: [8-1] LOG:  database system is ready to accept connections
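
For anyone triaging a log like the one above, here is a minimal Python sketch that picks out each PANIC entry and pairs it with the last STATEMENT logged by the same backend PID. The line layout in the regular expression is only inferred from this excerpt (it depends on the server's log_line_prefix setting), and the names parse_line and panics_with_context are hypothetical helpers, not part of PostgreSQL.

```python
import re

# Assumed layout, inferred from the excerpt above (depends on log_line_prefix):
#   "<date> <time> <tz> [<pid>]: [<n>-<m>] <SEVERITY>:  <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"\[(?P<pid>\d+)\]: \[(?P<seq>\d+)-\d+\] (?P<sev>[A-Z]+):\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": m["ts"],
        "pid": int(m["pid"]),
        "seq": int(m["seq"]),
        "sev": m["sev"],
        "msg": m["msg"],
    }


def panics_with_context(lines):
    """Return (panic_entry, last_statement_of_same_pid) pairs.

    Verbatim repeats of an entry (same timestamp, PID, sequence number,
    severity and message) are counted only once.
    """
    seen = set()
    last_statement = {}
    found = []
    for raw in lines:
        entry = parse_line(raw)
        if entry is None:
            continue
        key = (entry["ts"], entry["pid"], entry["seq"], entry["sev"], entry["msg"])
        if key in seen:
            continue
        seen.add(key)
        if entry["sev"] == "STATEMENT":
            last_statement[entry["pid"]] = entry["msg"]
        elif entry["sev"] == "PANIC":
            found.append((entry, last_statement.get(entry["pid"])))
    return found


# A few lines copied from the log above, including one repeated PANIC line.
SAMPLE = [
    "2008-03-29 22:25:15 EET [26841]: [1-1] ERROR:  canceling statement due to user request",
    "2008-03-29 22:25:15 EET [26841]: [2-1] STATEMENT:  vacuum full verbose xyz ;",
    "2008-03-29 22:25:15 EET [26841]: [3-1] PANIC:  cannot abort transaction 3778747509, it was already committed",
    "2008-03-29 22:25:15 EET [6476]: [4-1] LOG:  server process (PID 26841) was terminated by signal 6: Aborted",
    "2008-03-29 22:25:15 EET [26841]: [3-1] PANIC:  cannot abort transaction 3778747509, it was already committed",
]

found = panics_with_context(SAMPLE)
for panic, statement in found:
    print(panic["pid"], panic["msg"], "| statement:", statement)
```

Keying the lookup on the PID is a simplification: PIDs can be reused across sessions, so on a long log the timestamp should be taken into account as well.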


This seems quite serious to me (the PANIC says it cannot abort a transaction that was already committed). What can cause such behaviour?

Regards

Mikko
