Thread: pg_subtrans keeps bloating up in the standby
Hi, I received the off-list email reporting that pg_subtrans keeps bloating up in the standby, from Harald (Thanks!). I investigated this issue and found that the standby doesn't truncate pg_subtrans at all even though HS keeps extending it. In the master, a checkpoint calls TruncateSUBTRANS() and truncate old pg_subtrans entries, but in the standby, a restartpoint doesn't do that. And I found the following comment in CreateRestartPoint(): /* * Currently, there is no need to truncate pg_subtrans during recovery. If * we did do that, we will need to have calledStartupSUBTRANS() already * and then TruncateSUBTRANS() would go here. */ I'm not sure why there is no need to truncate pg_subtrans during recovery. To fix the issue, we should make a restartpoint call TruncateSUBTRANS(). Thought? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 27/08/10 16:39, Fujii Masao wrote: > I received the off-list email reporting that pg_subtrans keeps bloating up > in the standby, from Harald (Thanks!). I investigated this issue and found > that the standby doesn't truncate pg_subtrans at all even though HS keeps > extending it. In the master, a checkpoint calls TruncateSUBTRANS() and > truncate old pg_subtrans entries, but in the standby, a restartpoint doesn't > do that. And I found the following comment in CreateRestartPoint(): > > /* > * Currently, there is no need to truncate pg_subtrans during recovery. If > * we did do that, we will need to have called StartupSUBTRANS() already > * and then TruncateSUBTRANS() would go here. > */ > > I'm not sure why there is no need to truncate pg_subtrans during recovery. > To fix the issue, we should make a restartpoint call TruncateSUBTRANS(). > Thought? Hmm, agreed, seems like an oversight in hot standby. Before that, we didn't update pg_subtrans during recovery, so there was no point truncating it. But in hot standby, we do update it, so we need to truncate it too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Fri, Aug 27, 2010 at 11:25 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Hmm, agreed, seems like an oversight in hot standby. Before that, we didn't > update pg_subtrans during recovery, so there was no point truncating it. But > in hot standby, we do update it, so we need to truncate it too. Yes. The attached patch changes a restartpoint so that it truncates pg_subtrans when hot standby is enabled. When it's disabled, we need to do nothing because, in that case, pg_subtrans is not updated during recovery and StartupSUBTRANS() is not called before a restartpoint. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
Fujii Masao <masao.fujii@gmail.com> writes: > On Fri, Aug 27, 2010 at 11:25 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> Hmm, agreed, seems like an oversight in hot standby. Before that, we didn't >> update pg_subtrans during recovery, so there was no point truncating it. But >> in hot standby, we do update it, so we need to truncate it too. > Yes. The attached patch changes a restartpoint so that it truncates pg_subtrans > when hot standby is enabled. Has StartupSUBTRANS been done? regards, tom lane
On Sat, Aug 28, 2010 at 2:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> On Fri, Aug 27, 2010 at 11:25 PM, Heikki Linnakangas >> <heikki.linnakangas@enterprisedb.com> wrote: >>> Hmm, agreed, seems like an oversight in hot standby. Before that, we didn't >>> update pg_subtrans during recovery, so there was no point truncating it. But >>> in hot standby, we do update it, so we need to truncate it too. > >> Yes. The attached patch changes a restartpoint so that it truncates pg_subtrans >> when hot standby is enabled. > > Has StartupSUBTRANS been done? Yes. StartupXLOG calls that before bgwriter is invoked. That is, we can ensure that StartupSUBTRANS has always been done before bgwriter performs a restartpoint. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On 8/27/10 10:17 AM, Fujii Masao wrote: > On Sat, Aug 28, 2010 at 2:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Fujii Masao <masao.fujii@gmail.com> writes: >>> On Fri, Aug 27, 2010 at 11:25 PM, Heikki Linnakangas >>> <heikki.linnakangas@enterprisedb.com> wrote: >>>> Hmm, agreed, seems like an oversight in hot standby. Before that, we didn't >>>> update pg_subtrans during recovery, so there was no point truncating it. But >>>> in hot standby, we do update it, so we need to truncate it too. >>> Yes. The attached patch changes a restartpoint so that it truncates pg_subtrans >>> when hot standby is enabled. >> Has StartupSUBTRANS been done? > > Yes. StartupXLOG calls that before bgwriter is invoked. That is, we can > ensure that StartupSUBTRANS has always been done before bgwriter > performs a restartpoint. Should we hold off on RC1 for this issue? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Should we hold off on RC1 for this issue? rc1 was already wrapped yesterday. regards, tom lane
On 27/08/10 20:17, Fujii Masao wrote: > Yes. StartupXLOG calls that before bgwriter is invoked. That is, we can > ensure that StartupSUBTRANS has always been done before bgwriter > performs a restartpoint. Hmm, the comment in CreateCheckpoint() isn't totally accurate either: > * Truncate pg_subtrans if possible. We can throw away all data before > * the oldest XMIN of any running transaction. No future transaction will > * attempt to reference any pg_subtrans entry older than that (see Asserts > * in subtrans.c). During recovery, though, we mustn't do this because > * StartupSUBTRANS hasn't been called yet. because in Hot Standby mode, StartSUBTRANS has been called already. We could truncate pg_subtrans there too when hot standby is enabled. But this is only about the startup checkpoint at the end of recovery, so I'm inclined to not change that, not right now just before release anyway, just in case we're missing something... However, is it safe to use GetOldestXMin() during recovery? Or to put it other way, is GetOldestXMin() functioning correctly during hot standby? It only scans through the ProcArray, but not the known-assigned xids array. That seems like an oversight that needs to be fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, 2010-08-30 at 09:59 +0300, Heikki Linnakangas wrote: > On 27/08/10 20:17, Fujii Masao wrote: > > Yes. StartupXLOG calls that before bgwriter is invoked. That is, we can > > ensure that StartupSUBTRANS has always been done before bgwriter > > performs a restartpoint. > > Hmm, the comment in CreateCheckpoint() isn't totally accurate either: > > > * Truncate pg_subtrans if possible. We can throw away all data before > > * the oldest XMIN of any running transaction. No future transaction will > > * attempt to reference any pg_subtrans entry older than that (see Asserts > > * in subtrans.c). During recovery, though, we mustn't do this because > > * StartupSUBTRANS hasn't been called yet. Yep. > because in Hot Standby mode, StartSUBTRANS has been called already. We > could truncate pg_subtrans there too when hot standby is enabled. But > this is only about the startup checkpoint at the end of recovery, so I'm > inclined to not change that, not right now just before release anyway, > just in case we're missing something... > > However, is it safe to use GetOldestXMin() during recovery? Or to put it > other way, is GetOldestXMin() functioning correctly during hot standby? > It only scans through the ProcArray, but not the known-assigned xids > array. That seems like an oversight that needs to be fixed. Yes, thats correct. Otherwise the patch is fine. I'm working on this now and will commit something shortly. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services
On Mon, 2010-08-30 at 09:59 +0300, Heikki Linnakangas wrote: > However, is it safe to use GetOldestXMin() during recovery? Or to put it > other way, is GetOldestXMin() functioning correctly during hot standby? > It only scans through the ProcArray, but not the known-assigned xids > array. That seems like an oversight that needs to be fixed. Patch to implement that attached. This is necessary since CreateCheckpoint is called during end of recovery, though at that point there are still xids in KnownAssignedXids since they aren't removed until slightly later. Not hugely important. Also allows GetOldestXmin to be safely called elsewhere, such as Fujii's earlier patch on this thread. Any objections to commit to both head and 9.0? Will then commit Fujii's patch. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Attachment
On Mon, Aug 30, 2010 at 6:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2010-08-30 at 09:59 +0300, Heikki Linnakangas wrote: > >> However, is it safe to use GetOldestXMin() during recovery? Or to put it >> other way, is GetOldestXMin() functioning correctly during hot standby? >> It only scans through the ProcArray, but not the known-assigned xids >> array. That seems like an oversight that needs to be fixed. > > Patch to implement that attached. The type of the value which KnownAssignedXidsGetOldestXmin returns should be TransactionId rather than int? Does caller of KnownAssignedXidsGetOldestXmin need to hold ProcArrayLock in (at least) shared mode? If yes, it's helpful to add that note as a comment. > Will then commit Fujii's patch. Thanks. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center