Re: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Failed transaction statistics to measure the logical replication progress
Date
Msg-id CAA4eK1L7qDJj91_aV+vDiJsMBJixt9uuoRiPv4=SxEqNLNLzdQ@mail.gmail.com
Whole thread Raw
In response to RE: Failed transaction statistics to measure the logical replication progress  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Failed transaction statistics to measure the logical replication progress  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
On Fri, Feb 18, 2022 at 2:04 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, February 17, 2022 6:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Feb 17, 2022 at 3:13 PM Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > > Can't we use pgstat_report_stat() here? Basically, you can update xact
> > > completetion counters during apply, and then from
> > > pgstat_report_stat(), you can invoke a logical replication worker
> > > stats-related function.
> > >
> >
> > If we can do this then we can save the logic this patch is trying to introduce for
> > PGSTAT_STAT_INTERVAL.
> Hi, I've encounter a couple of questions during my modification, following your advice.
>
> In the pgstat_report_stat, we refer to the return value of
> GetCurrentTransactionStopTimestamp to compare the time different from the last time.
> (In my previous patch, I used GetCurrentTimestamp)
>
> This time is updated in apply_handle_commit_internal's CommitTransactionCommand for the apply worker.
> Then, if I update the subscription worker stats(commit_count/abort_count) immediately after
> this CommitTransactionCommand and immediately call pgstat_report_stat in the apply_handle_commit_internal,
> the time difference becomes too small (falls short of PGSTAT_STAT_INTERVAL).
> Also, the time of GetCurrentTransactionStopTimestamp is not updated
> even when I keep calling pgstat_report_stat repeatedly.
> Then, IIUC the next possible timing that message of commit_count or abort_count
> is sent to the stats collector would become the time when we execute another transaction
> by the apply worker and update the time for GetCurrentTransactionStopTimestamp
> and rerun pgstat_report_stat again.
>

I think but same is true in the case of the transaction in the backend
where we increment commit counter via AtEOXact_PgStat after updating
the transaction time. After that, we call pgstat_report_stat() at
later point. How is this case different?

> So, if we keep GetCurrentTransactionStopTimestamp without change,
> an update of stats depends on another new subsequent transaction if we simply merge those.
> (this leads to users cannot see the latest stats information ?)
>

I think this should be okay as these don't need to be accurate.

> At least, I got a test failure because of this function for streaming commit case
> because it uses poll_query_until to wait for stats update.
>

I feel it is not a good idea to wait for the accurate update of these counters.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: logical replication empty transactions
Next
From: Florin Irion
Date:
Subject: Re: Emit a warning if the extension's GUC is set incorrectly