Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding. - Mailing list pgsql-hackers
From | Arseny Sher |
---|---|
Subject | Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding. |
Date | |
Msg-id | 87k1qlve3d.fsf@ars-thinkpad Whole thread Raw |
In response to | Re: Fix slot's xmin advancement and subxact's lost snapshots indecoding. (Alvaro Herrera <alvherre@2ndquadrant.com>) |
Responses |
Re: Fix slot's xmin advancement and subxact's lost snapshots indecoding.
|
List | pgsql-hackers |
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Firstly -- this is top-notch detective work, kudos and thanks for the > patch and test cases. (I verified that both fail before the code fix.) Thank you! > Here's a v3. I applied a lot of makeup in order to try to understand > what's going on. I *think* I have a grasp on the original code and your > bugfix, not terribly firm I admit. Well, when I started digging it, I have found logical decoding code somewhat underdocumented. If you or any other committer will consider the patch trying to improve the comments, I can prepare one. > * you don't need to Assert that things are not NULL if you're > immediately going to dereference them. Indeed. > * I think setting snapshot_base to NULL in ReorderBufferCleanupTXN is > pointless, since the struct is gonna be freed shortly afterwards. Yeah, but it is a general style of the original code, e.g. see /* cosmetic... */ comments. It probably makes the picture a bit more aesthetic and consistent, kind of final chord, though I don't mind removing it. > * I rewrote many comments (both existing and some of the ones your patch > adds), and added lots of comments where there were none. Now the code is nicer, thanks. > * I'm a bit unsure about the comment atop ReorderBufferSetBaseSnapshot. > Obviously, the bit within the #if 0/#endif I'm going to remove before > push. It looks like you've started editing that bit and didn't finish. > I don't understand why it says "Needs to be called before any > changes are added with ReorderBufferQueueChange"; but if you edit that > function and add an assert that the base snapshot is set, it crashes > pretty quickly in the test_decoding tests. (The supposedly bogus > comment was there before your patch -- I'm not saying your comment > addition is faulty.) That's because REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID changes are queued whenever we have read that xact has modified the catalog, regardless of base snapshot existence. Even if there are no changes yet, we will need it for correct visibility of catalog, so I am inclined to remove the assert and comment or rephrase the latter with 'any *data-carrying* changes'. > * I also noticed that we're doing subtxn cleanup one by one in both > ReorderBufferAssignChild and ReorderBufferCommitChild, which means the > top-level txn is sought in the hash table over and over, which seems a > bit silly. Not this patch's problem to fix ... There is already one-entry cache in ReorderBufferTXNByXid. We could add 'don't cache me' flag to it and use it with subxacts, or simply pull top-level reorderbuffer out of loops. I'm fine with the rest of your edits. One more little comment: @@ -543,9 +546,10 @@ ReorderBufferQueueChange(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn, ReorderBufferTXN *txn; txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true); + Assert(txn->base_snapshot != NULL || txn->is_known_as_subxact); change->lsn = lsn; - Assert(InvalidXLogRecPtr != lsn); + Assert(!XLogRecPtrIsInvalid(lsn)); dlist_push_tail(&txn->changes, &change->node); txn->nentries++; txn->nentries_mem++; Since we do that, probably we should replace all lsn validity checks with XLogRectPtrIsInvalid? -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: