Thread: xact_start for walsender & logical decoding not updated
Hi, I think there's a minor bug in pg_stat_activity tracking of walsender processes. The issue is that xact_start is only updated at the very beginning when the walsender starts (so it's almost exactly equal to backend_start) and then just flips between NULL and that value. Reproducing this is trivial - just create a publication/subscription with the built-in logical replication, and run arbitrary workload. You'll see that the xact_start value never changes. I think the right fix is calling SetCurrentStatementStartTimestamp() right before StartTransactionCommand() in ReorderBufferCommit, per the attached patch. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, On 2019-12-10 00:44:09 +0100, Tomas Vondra wrote: > I think there's a minor bug in pg_stat_activity tracking of walsender > processes. The issue is that xact_start is only updated at the very > beginning when the walsender starts (so it's almost exactly equal to > backend_start) and then just flips between NULL and that value. > > Reproducing this is trivial - just create a publication/subscription > with the built-in logical replication, and run arbitrary workload. > You'll see that the xact_start value never changes. > > I think the right fix is calling SetCurrentStatementStartTimestamp() > right before StartTransactionCommand() in ReorderBufferCommit, per the > attached patch. > -- > Tomas Vondra http://www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c > index 53affeb877..5235fb31b8 100644 > --- a/src/backend/replication/logical/reorderbuffer.c > +++ b/src/backend/replication/logical/reorderbuffer.c > @@ -1554,7 +1554,10 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid, > if (using_subtxn) > BeginInternalSubTransaction("replay"); > else > + { > + SetCurrentStatementStartTimestamp(); > StartTransactionCommand(); > + } I'm quite doubtful this is useful. To me this seems to do nothing but add the overhead of timestamp computation - which isn't always that cheap. I don't think you really can draw meaning from this? Greetings, Andres Freund
At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in > Hi, > > I think there's a minor bug in pg_stat_activity tracking of walsender > processes. The issue is that xact_start is only updated at the very > beginning when the walsender starts (so it's almost exactly equal to > backend_start) and then just flips between NULL and that value. > > Reproducing this is trivial - just create a publication/subscription > with the built-in logical replication, and run arbitrary workload. > You'll see that the xact_start value never changes. > > I think the right fix is calling SetCurrentStatementStartTimestamp() > right before StartTransactionCommand() in ReorderBufferCommit, per the > attached patch. I'm not sure how much xact_start for walsender is useful and we really is not running a statement there. Also autovac launcher starts transaction without a valid statement timestamp perhaps for the same reason. However, if we want to show something meaningful there, I think commit_time might be more informative there. If we use GetCurrentTimestamp(), StartTransaction() already has the same feature for autonomous transactions. I suppose we should do them a unified way. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Dec 09, 2019 at 04:04:40PM -0800, Andres Freund wrote: >Hi, > >On 2019-12-10 00:44:09 +0100, Tomas Vondra wrote: >> I think there's a minor bug in pg_stat_activity tracking of walsender >> processes. The issue is that xact_start is only updated at the very >> beginning when the walsender starts (so it's almost exactly equal to >> backend_start) and then just flips between NULL and that value. >> >> Reproducing this is trivial - just create a publication/subscription >> with the built-in logical replication, and run arbitrary workload. >> You'll see that the xact_start value never changes. >> >> I think the right fix is calling SetCurrentStatementStartTimestamp() >> right before StartTransactionCommand() in ReorderBufferCommit, per the >> attached patch. > >> -- >> Tomas Vondra http://www.2ndQuadrant.com >> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > >> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c >> index 53affeb877..5235fb31b8 100644 >> --- a/src/backend/replication/logical/reorderbuffer.c >> +++ b/src/backend/replication/logical/reorderbuffer.c >> @@ -1554,7 +1554,10 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid, >> if (using_subtxn) >> BeginInternalSubTransaction("replay"); >> else >> + { >> + SetCurrentStatementStartTimestamp(); >> StartTransactionCommand(); >> + } > >I'm quite doubtful this is useful. To me this seems to do nothing but >add the overhead of timestamp computation - which isn't always that >cheap. I don't think you really can draw meaning from this? > I don't want to use this timestamp directly, but it does interfere with monitoring of long-running transactiosn looking at pg_stat_activity. With the current behavior, the walsender entries have ancient timestamps and produce random blips in monitoring. Of course, it's possible to edit the queries to skip entries with backend_type = walsender, but that's a bit inconvenient. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote: >At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in >> Hi, >> >> I think there's a minor bug in pg_stat_activity tracking of walsender >> processes. The issue is that xact_start is only updated at the very >> beginning when the walsender starts (so it's almost exactly equal to >> backend_start) and then just flips between NULL and that value. >> >> Reproducing this is trivial - just create a publication/subscription >> with the built-in logical replication, and run arbitrary workload. >> You'll see that the xact_start value never changes. >> >> I think the right fix is calling SetCurrentStatementStartTimestamp() >> right before StartTransactionCommand() in ReorderBufferCommit, per the >> attached patch. > >I'm not sure how much xact_start for walsender is useful and we really >is not running a statement there. Also autovac launcher starts >transaction without a valid statement timestamp perhaps for the same >reason. > Maybe, but then maybe we should change it so that we don't report any timestamps for such processes. >However, if we want to show something meaningful there, I think >commit_time might be more informative there. If we use >GetCurrentTimestamp(), StartTransaction() already has the same feature >for autonomous transactions. I suppose we should do them a unified >way. > I don't think so. We have this information from the apply side, and this is really about the *new* transaction started in reorderbuffer. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Dec-10, Tomas Vondra wrote: > On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote: > > At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in > > I'm not sure how much xact_start for walsender is useful and we really > > is not running a statement there. Also autovac launcher starts > > transaction without a valid statement timestamp perhaps for the same > > reason. > > Maybe, but then maybe we should change it so that we don't report any > timestamps for such processes. Yeah, I think we should to that. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-12-10 12:56:56 +0100, Tomas Vondra wrote: > On Mon, Dec 09, 2019 at 04:04:40PM -0800, Andres Freund wrote: > > On 2019-12-10 00:44:09 +0100, Tomas Vondra wrote: > > > I think there's a minor bug in pg_stat_activity tracking of walsender > > > processes. The issue is that xact_start is only updated at the very > > > beginning when the walsender starts (so it's almost exactly equal to > > > backend_start) and then just flips between NULL and that value. > > > > > > Reproducing this is trivial - just create a publication/subscription > > > with the built-in logical replication, and run arbitrary workload. > > > You'll see that the xact_start value never changes. > > > > > > I think the right fix is calling SetCurrentStatementStartTimestamp() > > > right before StartTransactionCommand() in ReorderBufferCommit, per the > > > attached patch. > > > > > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c > > > index 53affeb877..5235fb31b8 100644 > > > --- a/src/backend/replication/logical/reorderbuffer.c > > > +++ b/src/backend/replication/logical/reorderbuffer.c > > > @@ -1554,7 +1554,10 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid, > > > if (using_subtxn) > > > BeginInternalSubTransaction("replay"); > > > else > > > + { > > > + SetCurrentStatementStartTimestamp(); > > > StartTransactionCommand(); > > > + } > > > > I'm quite doubtful this is useful. To me this seems to do nothing but > > add the overhead of timestamp computation - which isn't always that > > cheap. I don't think you really can draw meaning from this? > > > > I don't want to use this timestamp directly, but it does interfere with > monitoring of long-running transactiosn looking at pg_stat_activity. > With the current behavior, the walsender entries have ancient timestamps > and produce random blips in monitoring. Of course, it's possible to edit > the queries to skip entries with backend_type = walsender, but that's a > bit inconvenient. Oh, I'm not suggesting that we shouldn't fix this somehow, just that I'm doubtful that that adding a lot of additional SetCurrentStatementStartTimestamp() calls is the right thing. Besides the overhead, it'd also just not be a meaningful value here - neither is it an actual transaction, nor is it the right thing to be monitoring when concerned about bloat or such. It seems like it might be better to instead cause NULL to be returned for the respective column in pg_stat_activity etc? Greetings, Andres Freund
On Wed, 11 Dec 2019 at 02:08, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Dec-10, Tomas Vondra wrote:
> On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote:
> > At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
> > I'm not sure how much xact_start for walsender is useful and we really
> > is not running a statement there. Also autovac launcher starts
> > transaction without a valid statement timestamp perhaps for the same
> > reason.
>
> Maybe, but then maybe we should change it so that we don't report any
> timestamps for such processes.
Yeah, I think we should to that.
Agreed. Don't report a transaction start timestamp at all if we're not in a read/write txn in the walsender, which we should never be when using a historic snapshot.
It's not interesting or relevant.
Reporting the commit timestamp of the current or last-processed xact would likely just be fonfusing. I'd rather see that in pg_stat_replication if we're going to show it, that way we can label it usefully.
At Fri, 13 Dec 2019 13:05:41 +0800, Craig Ringer <craig@2ndquadrant.com> wrote in > On Wed, 11 Dec 2019 at 02:08, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > > > On 2019-Dec-10, Tomas Vondra wrote: > > > > > On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote: > > > > At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra < > > tomas.vondra@2ndquadrant.com> wrote in > > > > > > I'm not sure how much xact_start for walsender is useful and we really > > > > is not running a statement there. Also autovac launcher starts > > > > transaction without a valid statement timestamp perhaps for the same > > > > reason. > > > > > > Maybe, but then maybe we should change it so that we don't report any > > > timestamps for such processes. > > > > Yeah, I think we should to that. > > > Agreed. Don't report a transaction start timestamp at all if we're not in a > read/write txn in the walsender, which we should never be when using a > historic snapshot. > > It's not interesting or relevant. > > Reporting the commit timestamp of the current or last-processed xact would > likely just be fonfusing. I'd rather see that in pg_stat_replication if > we're going to show it, that way we can label it usefully. Sounds reasonable. By the way, the starting of this thread is a valid value in xact_timestample for a moment at the starting of logical replication. (I couln't see it unless I inserted a sleep() in IndentifySystem()). I'm not sure but AFAIS it is the only instance in walsendeer. Should we take the trouble to stop that? (I put -1 for it) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 2019-Dec-13, Kyotaro Horiguchi wrote: > At Fri, 13 Dec 2019 13:05:41 +0800, Craig Ringer <craig@2ndquadrant.com> wrote in > > On Wed, 11 Dec 2019 at 02:08, Alvaro Herrera <alvherre@2ndquadrant.com> > > wrote: > > > > > On 2019-Dec-10, Tomas Vondra wrote: > > > > > > > On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote: > > > > > At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra < > > > tomas.vondra@2ndquadrant.com> wrote in > > > > > > > > I'm not sure how much xact_start for walsender is useful and we really > > > > > is not running a statement there. Also autovac launcher starts > > > > > transaction without a valid statement timestamp perhaps for the same > > > > > reason. > > > > > > > > Maybe, but then maybe we should change it so that we don't report any > > > > timestamps for such processes. > > > > > > Yeah, I think we should to that. > > Agreed. Don't report a transaction start timestamp at all if we're not in a > > read/write txn in the walsender, which we should never be when using a > > historic snapshot. > > > > It's not interesting or relevant. This patch changes xact.c to avoid updating transaction start timestamps for walsenders (maybe more commentary is desirable). I think logical decoding is just a special form of walsender and thus it would also be updated by this patch, unless I misunderstood what Tomas explained. > > Reporting the commit timestamp of the current or last-processed xact would > > likely just be confusing. I'd rather see that in pg_stat_replication if > > we're going to show it, that way we can label it usefully. > > Sounds reasonable. Developers interested in this feature can submit a patch, as usual :-) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Dec 27, 2019 at 04:46:18PM -0300, Alvaro Herrera wrote: >On 2019-Dec-13, Kyotaro Horiguchi wrote: > >> At Fri, 13 Dec 2019 13:05:41 +0800, Craig Ringer <craig@2ndquadrant.com> wrote in >> > On Wed, 11 Dec 2019 at 02:08, Alvaro Herrera <alvherre@2ndquadrant.com> >> > wrote: >> > >> > > On 2019-Dec-10, Tomas Vondra wrote: >> > > >> > > > On Tue, Dec 10, 2019 at 09:42:17AM +0900, Kyotaro Horiguchi wrote: >> > > > > At Tue, 10 Dec 2019 00:44:09 +0100, Tomas Vondra < >> > > tomas.vondra@2ndquadrant.com> wrote in >> > > >> > > > > I'm not sure how much xact_start for walsender is useful and we really >> > > > > is not running a statement there. Also autovac launcher starts >> > > > > transaction without a valid statement timestamp perhaps for the same >> > > > > reason. >> > > > >> > > > Maybe, but then maybe we should change it so that we don't report any >> > > > timestamps for such processes. >> > > >> > > Yeah, I think we should to that. >> > Agreed. Don't report a transaction start timestamp at all if we're not in a >> > read/write txn in the walsender, which we should never be when using a >> > historic snapshot. >> > >> > It's not interesting or relevant. > >This patch changes xact.c to avoid updating transaction start timestamps >for walsenders (maybe more commentary is desirable). I think logical >decoding is just a special form of walsender and thus it would also be >updated by this patch, unless I misunderstood what Tomas explained. > It's true walsender should not be doing any read-write transactions or executing statements (well, maybe a decoding plugin could, but using historic snapshot). So I agree not leaving xact_start for walsender processes seems OK. >> > Reporting the commit timestamp of the current or last-processed xact would >> > likely just be confusing. I'd rather see that in pg_stat_replication if >> > we're going to show it, that way we can label it usefully. >> >> Sounds reasonable. > >Developers interested in this feature can submit a patch, as usual :-) > ;-) regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Dec-29, Tomas Vondra wrote: > On Fri, Dec 27, 2019 at 04:46:18PM -0300, Alvaro Herrera wrote: > > > > This patch changes xact.c to avoid updating transaction start timestamps > > for walsenders (maybe more commentary is desirable). I think logical > > decoding is just a special form of walsender and thus it would also be > > updated by this patch, unless I misunderstood what Tomas explained. > > > > It's true walsender should not be doing any read-write transactions or > executing statements (well, maybe a decoding plugin could, but using > historic snapshot). > > So I agree not leaving xact_start for walsender processes seems OK. OK, I pushed my patch to branches 10 - master. (See https://postgr.es/m/20200107211624.GA18974@alvherre.pgsql ) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > OK, I pushed my patch to branches 10 - master. > (See https://postgr.es/m/20200107211624.GA18974@alvherre.pgsql ) The buildfarm seems less than happy with this. regards, tom lane
I wrote: > The buildfarm seems less than happy with this. ... and, having now looked at the patch, I'm not surprised. Breaking stmtStartTimestamp, which is what you did, seems like an awfully side-effect-filled route to the goal. If you want to prevent monitoring from showing this, why didn't you just prevent monitoring from showing it? That is, I'd have expected some am_walsender logic in or near pgstat.c, not here. regards, tom lane
On 2020-Jan-07, Tom Lane wrote: > I wrote: > > The buildfarm seems less than happy with this. > > ... and, having now looked at the patch, I'm not surprised. > Breaking stmtStartTimestamp, which is what you did, seems like > an awfully side-effect-filled route to the goal. If you want > to prevent monitoring from showing this, why didn't you just > prevent monitoring from showing it? That is, I'd have expected > some am_walsender logic in or near pgstat.c, not here. That seems a pretty simple patch; attached (untested). However, my patch seemed a pretty decent way to achieve the goal, and I don't understand why it causes the failure, or indeed why we care about stmtStartTimestamp at all. I'll look into this again tomorrow. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2020-Jan-07, Tom Lane wrote: >> ... and, having now looked at the patch, I'm not surprised. >> Breaking stmtStartTimestamp, which is what you did, seems like >> an awfully side-effect-filled route to the goal. If you want >> to prevent monitoring from showing this, why didn't you just >> prevent monitoring from showing it? That is, I'd have expected >> some am_walsender logic in or near pgstat.c, not here. > That seems a pretty simple patch; attached (untested). I think you want && not ||, but otherwise that looks about right. > However, my > patch seemed a pretty decent way to achieve the goal, and I don't > understand why it causes the failure, or indeed why we care about > stmtStartTimestamp at all. I'll look into this again tomorrow. I'm not 100% sure why the failure either. The assertion is in code that should only be reached in a parallel worker, and surely walsenders don't launch parallel queries? But it looks to me that all the critters using force_parallel_mode are unhappy. In any case, my larger point is that stmtStartTimestamp is globally accessible state (via GetCurrentStatementStartTimestamp()) and you can have little idea which corners of our code are using it, let alone what extensions might expect about it. Plus it feeds into xactStartTimestamp (cf StartTransaction()), increasing the footprint for unwanted side-effects even more. Redefining its meaning to fix this problem is a really bad idea IMO. regards, tom lane
On 2020-Jan-07, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > On 2020-Jan-07, Tom Lane wrote: > > That seems a pretty simple patch; attached (untested). > > I think you want && not ||, but otherwise that looks about right. Thanks, you were right; pushed. > > However, my > > patch seemed a pretty decent way to achieve the goal, and I don't > > understand why it causes the failure, or indeed why we care about > > stmtStartTimestamp at all. I'll look into this again tomorrow. > > I'm not 100% sure why the failure either. The assertion is in > code that should only be reached in a parallel worker, and surely > walsenders don't launch parallel queries? But it looks to me > that all the critters using force_parallel_mode are unhappy. I reproduced here with force_parallel_mode=regress, and indeed what is happening is that log.rep. subscription walsenders (???) are running queries per commands/subscriptioncmds.c::fetch_table_list(), and under that GUC they beget parallel workers; and because the parent has am_walsender=true then they pass a timestamp of 0 to the children; but the children retain am_walsender=false, so the assertion fires. I didn't spend more time on that, but it seems strange and possibly dangerous, since am_walsender is used to implement some restrictions. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> I'm not 100% sure why the failure either. The assertion is in >> code that should only be reached in a parallel worker, and surely >> walsenders don't launch parallel queries? But it looks to me >> that all the critters using force_parallel_mode are unhappy. > I reproduced here with force_parallel_mode=regress, and indeed what is > happening is that log.rep. subscription walsenders (???) are running > queries per commands/subscriptioncmds.c::fetch_table_list(), and under > that GUC they beget parallel workers; and because the parent has > am_walsender=true then they pass a timestamp of 0 to the children; but > the children retain am_walsender=false, so the assertion fires. > I didn't spend more time on that, but it seems strange and possibly > dangerous, since am_walsender is used to implement some restrictions. Indeed. I think it's a truly horrible idea that we are issuing SPI queries inside replication mechanisms. Quite aside from this problem, do we really think that's free of security issues? Or even if you think it is today, can it be kept so? (I've ranted before about keeping a proper layering design in this stuff. Just because it's easier to do stuff by calling a SQL query doesn't mean that we should consider that acceptable.) regards, tom lane