Thread: pgsql: Resolve timing issue with logging locks for Hot Standby.
Resolve timing issue with logging locks for Hot Standby. We log AccessExclusiveLocks for replay onto standby nodes, but because of timing issues on ProcArray it is possible to log a lock that is still held by a just committed transaction that is very soon to be removed. To avoid any timing issue we avoid applying locks made by transactions with InvalidXid. Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/c172b7b02e6f6008d6dad66ddee8f67faf223c5b Modified Files -------------- src/backend/storage/ipc/procarray.c | 8 +-- src/backend/storage/ipc/standby.c | 110 ++++++++++++++++++++++++----------- src/backend/storage/lmgr/lock.c | 12 ++++- src/include/storage/standby.h | 2 +- 4 files changed, 88 insertions(+), 44 deletions(-)
Simon Riggs <simon@2ndQuadrant.com> writes: > Resolve timing issue with logging locks for Hot Standby. > We log AccessExclusiveLocks for replay onto standby nodes, > but because of timing issues on ProcArray it is possible to > log a lock that is still held by a just committed transaction > that is very soon to be removed. To avoid any timing issue we > avoid applying locks made by transactions with InvalidXid. > Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee I see this was only applied to HEAD. Wouldn't back-patching be in order? The problem is in 9.1 as well, no? regards, tom lane
On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> Resolve timing issue with logging locks for Hot Standby. >> We log AccessExclusiveLocks for replay onto standby nodes, >> but because of timing issues on ProcArray it is possible to >> log a lock that is still held by a just committed transaction >> that is very soon to be removed. To avoid any timing issue we >> avoid applying locks made by transactions with InvalidXid. > >> Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee > > I see this was only applied to HEAD. Wouldn't back-patching be in > order? The problem is in 9.1 as well, no? Yes, it is. I prefer to give a little time before backpatching to avoid mistakes (of my own making), especially since we're busy enough not to want to divert energy to other releases right now. The patch will make it in before next minor release. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jan 30, 2012 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Simon Riggs <simon@2ndQuadrant.com> writes: >>> Resolve timing issue with logging locks for Hot Standby. >>> We log AccessExclusiveLocks for replay onto standby nodes, >>> but because of timing issues on ProcArray it is possible to >>> log a lock that is still held by a just committed transaction >>> that is very soon to be removed. To avoid any timing issue we >>> avoid applying locks made by transactions with InvalidXid. >> >>> Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee >> >> I see this was only applied to HEAD. Wouldn't back-patching be in >> order? The problem is in 9.1 as well, no? > > Yes, it is. I prefer to give a little time before backpatching to > avoid mistakes (of my own making), especially since we're busy enough > not to want to divert energy to other releases right now. The patch > will make it in before next minor release. If it's going to go in before the next minor release, there's no real value in holding off, is there? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jan 30, 2012 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I see this was only applied to HEAD. �Wouldn't back-patching be in >>> order? �The problem is in 9.1 as well, no? >> Yes, it is. I prefer to give a little time before backpatching to >> avoid mistakes (of my own making), especially since we're busy enough >> not to want to divert energy to other releases right now. The patch >> will make it in before next minor release. > If it's going to go in before the next minor release, there's no real > value in holding off, is there? It seems like a strange approach to me as well. The longer you wait, the more the patch details will have faded from your mind, increasing the chance that you'll mess up any necessary cross-branch adjustments. I also consider the possibility to be non-zero that it won't get done at all before the next release. Another aspect of this is that when the same-or-almost-the-same patch is applied to multiple branches, it's a *lot* easier for future archeology in the commit logs if the patch goes into all the affected branches at the same time (and with the same commit message, please). Lastly, one of the main reasons for wanting this particular bug fixed (since after all it's just an Assert failure in the end) is to not have transient buildfarm failures from it. Those failures are still happening, in 9.1 if not HEAD, and still distracting attention from more significant issues. regards, tom lane
On 01/30/2012 10:45 AM, Tom Lane wrote: > Another aspect of this is that when the same-or-almost-the-same patch > is applied to multiple branches, it's a *lot* easier for future > archeology in the commit logs if the patch goes into all the affected > branches at the same time (and with the same commit message, please). Yeah, I've been pinged on this before, for exactly this reason. It makes release note preparation significantly easier, AIUI. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 01/30/2012 10:45 AM, Tom Lane wrote: >> Another aspect of this is that when the same-or-almost-the-same patch >> is applied to multiple branches, it's a *lot* easier for future >> archeology in the commit logs if the patch goes into all the affected >> branches at the same time (and with the same commit message, please). > Yeah, I've been pinged on this before, for exactly this reason. It makes > release note preparation significantly easier, AIUI. Not just release notes, but routine questions like "when was this bug fixed, and in which branches?". I keep a regularly-updated copy of the output of src/tools/git_changelog (before git we used cvs2cl to get similar output from CVS). It's a rare day that I don't have some occasion to consult it. I don't appreciate committers randomly cluttering that log by making commits that the tool doesn't know how to merge. regards, tom lane
On Mon, Jan 30, 2012 at 4:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 01/30/2012 10:45 AM, Tom Lane wrote: >>> Another aspect of this is that when the same-or-almost-the-same patch >>> is applied to multiple branches, it's a *lot* easier for future >>> archeology in the commit logs if the patch goes into all the affected >>> branches at the same time (and with the same commit message, please). > >> Yeah, I've been pinged on this before, for exactly this reason. It makes >> release note preparation significantly easier, AIUI. > > Not just release notes, but routine questions like "when was this bug > fixed, and in which branches?". I keep a regularly-updated copy of > the output of src/tools/git_changelog (before git we used cvs2cl > to get similar output from CVS). It's a rare day that I don't have > some occasion to consult it. I don't appreciate committers randomly > cluttering that log by making commits that the tool doesn't know how > to merge. As you request it, I will endeavour to do this next time. I think its more likely errors will occur that way, however. Responsibility for any such errors is still mine so there may be times where that outweighs other thoughts, but as policy I'm happy to make apply-all-at-once the default. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services