Re: [HACKERS] WAL logging problem in 9.4.3? - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: [HACKERS] WAL logging problem in 9.4.3? |
Date | |
Msg-id | 20190818035230.GB3021338@rfd.leadboat.com Whole thread Raw |
In response to | Re: [HACKERS] WAL logging problem in 9.4.3? (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: [HACKERS] WAL logging problem in 9.4.3?
|
List | pgsql-hackers |
For two-phase commit, PrepareTransaction() needs to execute pending syncs. On Thu, Jul 25, 2019 at 10:39:36AM +0900, Kyotaro Horiguchi wrote: > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -715,12 +702,6 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap, > /* Remember if it's a system catalog */ > is_system_catalog = IsSystemRelation(OldHeap); > > - /* > - * We need to log the copied data in WAL iff WAL archiving/streaming is > - * enabled AND it's a WAL-logged rel. > - */ > - use_wal = XLogIsNeeded() && RelationNeedsWAL(NewHeap); > - > /* use_wal off requires smgr_targblock be initially invalid */ > Assert(RelationGetTargetBlock(NewHeap) == InvalidBlockNumber); Since you're deleting the use_wal variable, update that last comment. > --- a/src/backend/catalog/storage.c > +++ b/src/backend/catalog/storage.c > @@ -428,21 +450,34 @@ smgrDoPendingDeletes(bool isCommit) > { > SMgrRelation srel; > > - srel = smgropen(pending->relnode, pending->backend); > - > - /* allocate the initial array, or extend it, if needed */ > - if (maxrels == 0) > + if (pending->dosync) > { > - maxrels = 8; > - srels = palloc(sizeof(SMgrRelation) * maxrels); > + /* Perform pending sync of WAL-skipped relation */ > + FlushRelationBuffersWithoutRelcache(pending->relnode, > + false); > + srel = smgropen(pending->relnode, pending->backend); > + smgrimmedsync(srel, MAIN_FORKNUM); This should sync all forks, not just MAIN_FORKNUM. Code that writes WAL for FSM_FORKNUM and VISIBILITYMAP_FORKNUM checks RelationNeedsWAL(). There may be no bug today, but it's conceptually wrong to make RelationNeedsWAL() return false due to this code, use RelationNeedsWAL() for multiple forks, and then not actually sync all forks. The https://postgr.es/m/559FA0BA.3080808@iki.fi design had another component not appearing here. It said, "Instead, at COMMIT, we'd fsync() the relation, or if it's smaller than some threshold, WAL-log the contents of the whole file at that point." Please write the part to WAL-log the contents of small files instead of syncing them. > --- a/src/backend/commands/copy.c > +++ b/src/backend/commands/copy.c > @@ -2725,28 +2722,9 @@ CopyFrom(CopyState cstate) > * If it does commit, we'll have done the table_finish_bulk_insert() at > * the bottom of this routine first. > * > - * As mentioned in comments in utils/rel.h, the in-same-transaction test > - * is not always set correctly, since in rare cases rd_newRelfilenodeSubid > - * can be cleared before the end of the transaction. The exact case is > - * when a relation sets a new relfilenode twice in same transaction, yet > - * the second one fails in an aborted subtransaction, e.g. > - * > - * BEGIN; > - * TRUNCATE t; > - * SAVEPOINT save; > - * TRUNCATE t; > - * ROLLBACK TO save; > - * COPY ... The comment material being deleted is still correct, so don't delete it. Moreover, the code managing rd_firstRelfilenodeSubid has a similar bug. The attached patch adds an assertion that RelationNeedsWAL() and the pendingDeletes array have the same opinion about the relfilenode, and it expands a test case to fail that assertion. > --- a/src/include/utils/rel.h > +++ b/src/include/utils/rel.h > @@ -74,11 +74,13 @@ typedef struct RelationData > SubTransactionId rd_createSubid; /* rel was created in current xact */ > SubTransactionId rd_newRelfilenodeSubid; /* new relfilenode assigned in > * current xact */ > + SubTransactionId rd_firstRelfilenodeSubid; /* new relfilenode assigned > + * first in current xact */ In general, to add a field like this, run "git grep -n 'rd_.*Subid'" and audit all the lines printed. Many bits of code need to look at all three, e.g. RelationClose(). This field needs to be 100% reliable. In other words, it must equal InvalidSubTransactionId if and only if the relfilenode matches the relfilenode that would be in place if the top transaction rolled back. nm
Attachment
pgsql-hackers by date: