RE: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers
From | osumi.takamichi@fujitsu.com |
---|---|
Subject | RE: Failed transaction statistics to measure the logical replication progress |
Date | |
Msg-id | TYCPR01MB83732A6796425B20CDC76987ED4A9@TYCPR01MB8373.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | RE: Failed transaction statistics to measure the logical replication progress ("tanghy.fnst@fujitsu.com" <tanghy.fnst@fujitsu.com>) |
Responses |
Re: Failed transaction statistics to measure the logical replication progress
|
List | pgsql-hackers |
On Friday, December 31, 2021 10:12 AM Tang, Haiying/唐 海英 <tanghy.fnst@fujitsu.com> wrote: > On Wednesday, December 22, 2021 6:14 PM osumi.takamichi@fujitsu.com > <osumi.takamichi@fujitsu.com> wrote: > > > > Attached the new patch v19. > > > > Thanks for your patch. I think it's better if you could add this patch to the > commitfest. > Here are some comments: Thank you for your review ! I've created one entry in the next commitfest for this patch [1] > > 1) > + <structfield>commit_count</structfield> <type>bigint</type> > + </para> > + <para> > + Number of transactions successfully applied in this subscription. > + Both COMMIT and COMMIT PREPARED increment this counter. > + </para></entry> > + </row> > ... > > I think the commands (like COMMIT, COMMIT PREPARED ...) can be > surrounded with "<command> </command>", thoughts? Makes sense to me. Fixed. Note that to the user perspective, we should write only COMMIT and COMMIT PREPARED in the documentation. Thus, I don't list up other commands. I wrapped ROLLBACK PREPARED for abort_count as well. > 2) > +extern void pgstat_report_subworker_xact_end(LogicalRepWorker > *repWorker, > + > LogicalRepMsgType command, > + > bool bforce); > > Should "bforce" be "force"? Fixed the typo. > 3) > + * This should be called before the call of process_syning_tables() so > + to not > > "process_syning_tables()" should be "process_syncing_tables()". Fixed. > 4) > +void > +pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool > +force) { > + static TimestampTz last_report = 0; > + PgStat_MsgSubWorkerXactEnd msg; > + > + if (!force) > + { > ... > + if (!TimestampDifferenceExceeds(last_report, now, > PGSTAT_STAT_INTERVAL)) > + return; > + last_report = now; > + } > + > ... > + if (repWorker->commit_count == 0 && repWorker->abort_count == > 0) > + return; > ... > > I think it's better to check commit_count and abort_count first, then check if > reach PGSTAT_STAT_INTERVAL. > Otherwise if commit_count and abort_count are 0, it is possible that the value > of last_report has been updated but it didn't send stats in fact. In this case, > last_report is not the real time that send last message. Yeah, agreed. This fix is right in terms of the variable name aspect. The only scenario that we can take advantage of the previous implementation of v19's pgstat_send_subworker_xact_stats() should be a case where we execute a bunch of commit-like logical replication apply messages within PGSTAT_STAT_INTERVAL intensively and continuously for long period, because we check "repWorker->commit_count == 0 && repWorker->abort_count == 0" just once before calling pgstat_send() in this case. *But*, this scenario didn't look reasonable. In other words, the way to call TimestampDifferenceExceeds() only if there's any need of update for commit_count/abort_count looks less heavy. Accordingly, I've fixed it as you suggested. Also, I modified some comments in pgstat_send_subworker_xact_stats() for this change. Kindly have a look at the v20 shared in [2]. [1] - https://commitfest.postgresql.org/37/3504/ [2] - https://www.postgresql.org/message-id/TYCPR01MB8373AB2AE1A6EC7B9E012519ED4A9%40TYCPR01MB8373.jpnprd01.prod.outlook.com Best Regards, Takamichi Osumi
pgsql-hackers by date: