Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions |
Date | |
Msg-id | CAA4eK1KbvOAiUnQQxtXAXSQAAroU9h0msp6KXrysYOEVHb34vA@mail.gmail.com Whole thread Raw |
In response to | Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions |
List | pgsql-hackers |
On Fri, Jan 10, 2020 at 10:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Jan 6, 2020 at 2:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > Few more comments: > > -------------------------------- > > v4-0007-Implement-streaming-mode-in-ReorderBuffer > > 1. > > +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn) > > { > > .. > > + /* > > + * TOCHECK: We have to rebuild historic snapshot to be sure it includes all > > + * information about > > subtransactions, which could arrive after streaming start. > > + */ > > + if (!txn->is_schema_sent) > > + snapshot_now > > = ReorderBufferCopySnap(rb, txn->base_snapshot, > > + txn, > > command_id); > > .. > > } > > > > Why are we using base snapshot here instead of the snapshot we saved > > the first time streaming has happened? And as mentioned in comments, > > won't we need to consider the snapshots for subtransactions that > > arrived after the last time we have streamed the changes? > Fixed > +static void +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn) { .. + /* + * We can not use txn->snapshot_now directly because after we there + * might be some new sub-transaction which after the last streaming run + * so we need to add those sub-xip in the snapshot. + */ + snapshot_now = ReorderBufferCopySnap(rb, txn->snapshot_now, + txn, command_id); "because after we there", you seem to forget a word between 'we' and 'there'. So as we are copying it now, does this mean it will consider the snapshots for subtransactions that arrived after the last time we have streamed the changes? If so, have you tested it and can we add the same in comments. Also, if we need to copy the snapshot here, then do we need to again copy it in ReorderBufferProcessTXN(in below code and in catch block in the same function). { .. + /* + * Remember the command ID and snapshot if transaction is streaming + * otherwise free the snapshot if we have copied it. + */ + if (streaming) + { + txn->command_id = command_id; + txn->snapshot_now = ReorderBufferCopySnap(rb, snapshot_now, + txn, command_id); + } + else if (snapshot_now->copied) + ReorderBufferFreeSnap(rb, snapshot_now); .. } > > > > 4. In ReorderBufferStreamTXN(), don't we need to set some of the txn > > fields like origin_id, origin_lsn as we do in ReorderBufferCommit() > > especially to cover the case when it gets called due to memory > > overflow (aka via ReorderBufferCheckMemoryLimit). > We get origin_lsn during commit time so I am not sure how can we do > that. I have also noticed that currently, we are not using origin_lsn > on the subscriber side. I think need more investigation that if we > want this then do we need to log it early. > Have you done any investigation of this point? You might want to look at pg_replication_origin* APIs. Today, again looking at this code, I think with current coding, it won't be used even when we encounter commit record. Because ReorderBufferCommit calls ReorderBufferStreamCommit which will make sure that origin_id and origin_lsn is never sent. I think at least that should be fixed, if not, probably, we need a comment with reasoning why we think it is okay not to do in this case. + /* + * If we are streaming the in-progress transaction then Discard the /Discard/discard > > > > v4-0006-Gracefully-handle-concurrent-aborts-of-uncommitte > > 1. > > + /* > > + * If CheckXidAlive is valid, then we check if it aborted. If it did, we > > + * error out > > + */ > > + if (TransactionIdIsValid(CheckXidAlive) && > > + !TransactionIdIsInProgress(CheckXidAlive) && > > + !TransactionIdDidCommit(CheckXidAlive)) > > + ereport(ERROR, > > + (errcode(ERRCODE_TRANSACTION_ROLLBACK), > > + errmsg("transaction aborted during system catalog scan"))); > > > > Why here we can't use TransactionIdDidAbort? If we can't use it, then > > can you add comments stating the reason of the same. > Done + * If CheckXidAlive is valid, then we check if it aborted. If it did, we + * error out. Instead of directly checking the abort status we do check + * if it is not in progress transaction and no committed. Because if there + * were a system crash then status of the the transaction which were running + * at that time might not have marked. So we need to consider them as + * aborted. Refer detailed comments at snapmgr.c where the variable is + * declared. How about replacing the above comment with below one: If CheckXidAlive is valid, then we check if it aborted. If it did, we error out. We can't directly use TransactionIdDidAbort as after crash such transaction might not have been marked as aborted. See detailed comments at snapmgr.c where the variable is declared. I am not able to understand the change in v8-0011-BUGFIX-set-final_lsn-for-subxacts-before-cleanup. Do you have any explanation for the same? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: