Thread: Important 7.0.* fix to ensure buffers are released
Hiroshi Inoue pointed out that Postgres neglects to do an explicit
transaction abort during backend shutdown.  For example, in psql
	begin;
	declare myc cursor for select * from ..;
	fetch in myc;
	\q
would cause the backend to exit without having released the resources
acquired for the open transaction.  This is OK from the point of view
of data integrity (other transactions will believe that the transaction
was aborted) but not OK if shared resources are left locked up.  In
particular, this oversight probably accounts for the sporadic reports
we've seen of errors like

NOTICE:  FlushRelationBuffers(all_flows, 500237): block 171439 is
referenced (private 0, global 1)
FATAL 1:  VACUUM (vc_repair_frag): FlushRelationBuffers returned -2

since shared buffer reference counts would not be released by an
exiting backend, leading to a complaint (perhaps much later) when
VACUUM checks that there are no references to the relation it's
trying to vacuum.

I have fixed this problem in current sources and back-patched the
fix into the 7.0.* branch.  But I do not know when or if we'll have
a 7.0.3 release, so for anyone who's been annoyed by this problem
and doesn't want to wait, the patch for 7.0.* is attached.

			regards, tom lane

*** src/backend/tcop/postgres.c.orig	Sat May 20 22:23:30 2000
--- src/backend/tcop/postgres.c	Wed Aug 30 16:47:51 2000
***************
*** 1459,1465 ****
  	 * Initialize the deferred trigger manager
  	 */
  	if (DeferredTriggerInit() != 0)
! 		proc_exit(0);

  	SetProcessingMode(NormalProcessing);

--- 1459,1465 ----
  	 * Initialize the deferred trigger manager
  	 */
  	if (DeferredTriggerInit() != 0)
! 		goto normalexit;

  	SetProcessingMode(NormalProcessing);

***************
*** 1479,1490 ****
  		TPRINTF(TRACE_VERBOSE, "AbortCurrentTransaction");

  		AbortCurrentTransaction();
! 		InError = false;
  		if (ExitAfterAbort)
! 		{
! 			ProcReleaseLocks(); /* Just to be sure... */
! 			proc_exit(0);
! 		}
  	}

  	Warn_restart_ready = true;	/* we can now handle elog(ERROR) */
--- 1479,1489 ----
  		TPRINTF(TRACE_VERBOSE, "AbortCurrentTransaction");

  		AbortCurrentTransaction();
!
  		if (ExitAfterAbort)
! 			goto errorexit;
!
! 		InError = false;
  	}

  	Warn_restart_ready = true;	/* we can now handle elog(ERROR) */
***************
*** 1553,1560 ****
  				if (HandleFunctionRequest() == EOF)
  				{
  					/* lost frontend connection during F message input */
! 					pq_close();
! 					proc_exit(0);
  				}
  				break;

--- 1552,1558 ----
  				if (HandleFunctionRequest() == EOF)
  				{
  					/* lost frontend connection during F message input */
! 					goto normalexit;
  				}
  				break;

***************
*** 1608,1618 ****
  				 */
  			case 'X':
  			case EOF:
! 				if (!IsUnderPostmaster)
! 					ShutdownXLOG();
! 				pq_close();
! 				proc_exit(0);
! 				break;

  			default:
  				elog(ERROR, "unknown frontend message was received");
--- 1606,1612 ----
  				 */
  			case 'X':
  			case EOF:
! 				goto normalexit;

  			default:
  				elog(ERROR, "unknown frontend message was received");
***************
*** 1642,1651 ****
  			if (IsUnderPostmaster)
  				NullCommand(Remote);
  		}
! 	}							/* infinite for-loop */

! 	proc_exit(0);				/* shouldn't get here... */
! 	return 1;
  }

  #ifndef HAVE_GETRUSAGE
--- 1636,1655 ----
  			if (IsUnderPostmaster)
  				NullCommand(Remote);
  		}
! 	}							/* end of main loop */
!
! normalexit:
! 	ExitAfterAbort = true;		/* ensure we will exit if elog during abort */
! 	AbortOutOfAnyTransaction();
! 	if (!IsUnderPostmaster)
! 		ShutdownXLOG();
!
! errorexit:
! 	pq_close();
! 	ProcReleaseLocks();			/* Just to be sure... */
! 	proc_exit(0);

! 	return 1;					/* keep compiler quiet */
  }

  #ifndef HAVE_GETRUSAGE
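For readers who want to see the shape of the fix without the rest of
postgres.c, here is a minimal sketch of the same idea: every way out of
the message loop jumps to one label, and the abort happens there.  This
is not PostgreSQL code.  All names (toy_backend_main,
toy_abort_out_of_any_transaction, shared_pin_count, the abort_on_exit
flag) and the one-letter codes 'B' and 'F' are hypothetical; only 'X'
and end of input correspond to cases in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy state; these are NOT PostgreSQL symbols. */
static int  shared_pin_count;   /* stands in for a shared buffer refcount */
static bool in_transaction;

/* Stands in for AbortOutOfAnyTransaction(): release what the open xact holds. */
static void toy_abort_out_of_any_transaction(void)
{
    if (in_transaction)
    {
        shared_pin_count = 0;
        in_transaction = false;
    }
}

/*
 * Toy message loop.  'B' begins a transaction, 'F' fetches from a cursor
 * (pinning a page), 'X' or end of string ends the session.  Returns the
 * number of pins left behind when the toy backend exits.
 * abort_on_exit = false models the old behavior, true the patched one.
 */
int toy_backend_main(const char *messages, bool abort_on_exit)
{
    const char *m;

    assert(messages != NULL);
    shared_pin_count = 0;
    in_transaction = false;

    for (m = messages;; m++)
    {
        switch (*m)
        {
            case 'B':
                in_transaction = true;
                break;
            case 'F':
                if (in_transaction)
                    shared_pin_count++;
                break;
            case 'X':           /* client asked to terminate */
            case '\0':          /* end of input, treated like EOF */
                goto normalexit;
            default:
                break;          /* ignore anything else in this sketch */
        }
    }

normalexit:
    /* Single exit path: the cleanup cannot be skipped by any case above. */
    if (abort_on_exit)
        toy_abort_out_of_any_transaction();
    return shared_pin_count;
}
```

Calling toy_backend_main("BFX", false) models the psql session from the
report: the session ends inside a transaction and one pin is left
behind.  With abort_on_exit set to true the same input leaves none.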
> Hiroshi Inoue pointed out that Postgres neglects to do an explicit
> transaction abort during backend shutdown.  For example, in psql
> 	begin;
> 	declare myc cursor for select * from ..;
> 	fetch in myc;
> 	\q
> would cause the backend to exit without having released the resources
> acquired for the open transaction.  This is OK from the point of view
> of data integrity (other transactions will believe that the transaction
> was aborted) but not OK if shared resources are left locked up.  In
> particular, this oversight probably accounts for the sporadic reports
> we've seen of errors like
>
> NOTICE:  FlushRelationBuffers(all_flows, 500237): block 171439 is
> referenced (private 0, global 1)
> FATAL 1:  VACUUM (vc_repair_frag): FlushRelationBuffers returned -2
>
> since shared buffer reference counts would not be released by an
> exiting backend, leading to a complaint (perhaps much later) when
> VACUUM checks that there are no references to the relation it's
> trying to vacuum.

The interesting thing is that 6.5.x does not have the problem. Is it
a new one for 7.0.x?

I remember that you fixed some refcount leaks in 6.5.x. Could you
give me some examples that demonstrate the cases in 6.5.x that are
supposed to be fixed in 7.0.x? I just want to know what kinds of
refcount leak problems exist in 6.5.x and 7.0.x.
--
Tatsuo Ishii
t-ishii@sra.co.jp writes:
> The interesting thing is that 6.5.x does not have the problem. Is it
> a new one for 7.0.x?

I think the bug has been there for a long time.  It is easier to see
in 7.0.2 because VACUUM will now check for nonzero refcount on *all*
pages of the relation.  Formerly, it only checked pages that it was
about to actually truncate from the relation.  So it's possible for
an unreleased pin on a page to go unnoticed in 6.5 but generate a
complaint in 7.0.

Now that I look closely, I see that VACUUM still has a problem with
this in current sources: it only calls FlushRelationBuffers() if it
needs to shorten the relation.  So pinned pages will not be reported
unless the file gets shortened by at least one page.  This is a bug
because it means that pg_upgrade still can't trust VACUUM to ensure
that all on-row status bits are correct (see comments for
FlushRelationBuffers).  I will change it to call FlushRelationBuffers
always.

> I remember that you fixed some refcount leaks in 6.5.x. Could you
> give me some examples that demonstrate the cases in 6.5.x that are
> supposed to be fixed in 7.0.x?

I think the primary problems had to do with recursive calls to
ExecutorRun, which'd invoke the badly broken buffer refcount
save/restore mechanism that was present in 6.5 and earlier.  This
would mainly be done by SQL and PL functions that do SELECTs.  A
couple of examples:

* elog(ERROR) from inside an SQL function would mean that buffer
  refcounts held by the outer scan wouldn't be released.  So, eg,
	SELECT sqlfunction(column1) FROM foo;
  was a buffer leak risk.

* SQL functions returning sets could leak even without any elog(),
  if the entire set result was not read for some reason.

There were probably some non-SQL-function cases that got fixed along
the way, but I don't have any concrete examples.  See the pghackers
threads
	Anyone understand shared buffer refcount mechanism?
	Progress report: buffer refcount bugs and SQL functions
from September 1999 for more info.

			regards, tom lane
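To make the 6.5 versus 7.0 difference concrete, here is a small toy
model of the two checks described above.  It is only a sketch under
stated assumptions: a relation is modeled as a plain array of per-page
refcounts, and the function names (first_pinned_page,
check_truncated_pages, check_all_pages) are hypothetical, not the real
VACUUM or FlushRelationBuffers() code.

```c
#include <assert.h>
#include <stddef.h>

/* Return the first page in [first, end) with a nonzero refcount, or -1. */
int first_pinned_page(const int *refcounts, size_t first, size_t end)
{
    size_t i;

    for (i = first; i < end; i++)
    {
        if (refcounts[i] != 0)
            return (int) i;
    }
    return -1;
}

/*
 * Older-style check: look only at the pages about to be truncated away,
 * that is [new_nblocks, nblocks).  A leaked pin on a page that survives
 * the truncation goes unnoticed.
 */
int check_truncated_pages(const int *refcounts, size_t nblocks,
                          size_t new_nblocks)
{
    assert(new_nblocks <= nblocks);
    return first_pinned_page(refcounts, new_nblocks, nblocks);
}

/* 7.0-style check: look at every page of the relation. */
int check_all_pages(const int *refcounts, size_t nblocks)
{
    return first_pinned_page(refcounts, 0, nblocks);
}
```

With a ten-page relation that has a stale pin on page 2 and is being
shortened to eight pages, check_truncated_pages() finds nothing while
check_all_pages() reports page 2.  That is the sense in which the same
leak can be silent in 6.5 and produce a complaint in 7.0.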
> -----Original Message-----
> From: Tom Lane
>
> t-ishii@sra.co.jp writes:
> > The interesting thing is that 6.5.x does not have the problem. Is it
> > a new one for 7.0.x?
>
> I think the bug has been there for a long time.  It is easier to see

One of the reasons why we see the bug often in 7.0 seems to be
the following change, which was applied to temprel.c before 7.0.
Before the change, remove_all_temp_relations() always called
AbortOutOfAnyTransaction().
remove_all_temp_relations() has been called from shmem_exit(), so
proc_exit() always ended up calling AbortOutOfAnyTransaction() by
accident (I don't think that was intentional).

@@ -79,6 +79,9 @@
 	List	   *l,
 			   *next;

+	if (temp_rels == NIL)
+		return;
+
 	AbortOutOfAnyTransaction();
 	StartTransactionCommand();

Regards.

Hiroshi Inoue
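The effect of that early return can be shown with a toy model.  This is
a sketch, not the temprel.c code: the ToyBackend struct and the
functions toy_abort, toy_remove_temp_rels_old and
toy_remove_temp_rels_new are hypothetical stand-ins, and the abort is
reduced to clearing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-backend state for this sketch only. */
typedef struct ToyBackend
{
    int  temp_rel_count;    /* number of temp relations created */
    bool in_transaction;    /* is a transaction still open? */
    int  pinned_buffers;    /* pins held by the open transaction */
} ToyBackend;

/* Stands in for AbortOutOfAnyTransaction(). */
static void toy_abort(ToyBackend *b)
{
    if (b->in_transaction)
    {
        b->pinned_buffers = 0;
        b->in_transaction = false;
    }
}

/* Before the change: the exit hook always aborts first. */
void toy_remove_temp_rels_old(ToyBackend *b)
{
    assert(b != NULL);
    toy_abort(b);
    b->temp_rel_count = 0;
}

/* After the change: no temp relations means the abort is skipped too. */
void toy_remove_temp_rels_new(ToyBackend *b)
{
    assert(b != NULL);
    if (b->temp_rel_count == 0)
        return;
    toy_abort(b);
    b->temp_rel_count = 0;
}
```

For a backend that exits inside a transaction with one pinned buffer
and no temp relations, the old hook leaves zero pins and the new hook
leaves one.  The cleanup was a side effect of the hook rather than an
explicit shutdown step, so an optimization that looked harmless removed
it for most sessions.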
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > One of the reason why we see the bug often in 7.0 seems to be > the following change which was applied to temprel.c before 7.0. > remove_all_temp_relations() always called AbortOutAnyTransaction() > before the change. Bingo! So actually there was an abort-transaction call buried in the shutdown process. I wondered why we didn't see more problems... Anyway, I've added an AbortOutOfAnyTransaction() call to postgres.c, so the behavior should be more straightforward now. regards, tom lane