Thread: Small SSI issues
It works wonders to take a couple of months' break from looking at a piece of code, and then review it in detail again. It's like a whole new pair of eyes :-). Here's a bunch of small issues that I spotted:

* The oldserxid code is broken for non-default BLCKSZ.
  o The warning will come either too early or too late
  o There is no safeguard against actually wrapping around the
    SLRU, just the warning
  o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
    with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
    than necessary to cover the whole range of 2^32 transactions,
    so at high XIDs, say 2^32-1, doesn't it incorrectly think that
    low XIDs, say 10, are in the past, not the future? We've
    discussed these SLRU issues already, but still..

> /*
>  * Keep a pointer to the currently-running serializable transaction (if any)
>  * for quick reference.
>  * TODO SSI: Remove volatile qualifier and the then-unnecessary casts?
>  */
> static volatile SERIALIZABLEXACT *MySerializableXact = InvalidSerializableXact;

* So, should we remove it? In most places where contents of
  MySerializableXact are modified, we're holding
  SerializableXactHashLock in exclusive mode. However, there are two
  exceptions:

  o GetSafeSnapshot() modifies MySerializableXact->flags without any
    lock. It also inspects MySerializableXact->possibleUnsafeConflicts
    without a lock. What if someone sets some other flag in the flags
    bitmap just when GetSafeSnapshot() is about to set
    SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost :-(.

  o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE without
    holding a lock. The same danger is here if someone else tries to
    set some other flag concurrently.

I think we should simply acquire SerializableXactHashLock in GetSafeSnapshot(). It shouldn't be so much of a hotspot that it would make any difference in performance. CheckForSerializableConflictIn() might be called a lot, however, so maybe we need to check if the flag is set first, and only acquire the lock and set it if it's not set already.

* Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
  ReleasePredicateLocks() for a fleeting moment while the function
  releases all conflicts and locks held by the transaction, and
  finally the sxact struct itself containing the flag. Also, isn't a
  transaction that's already been marked for death the same as one
  that has already rolled back, for the purposes of detecting
  conflicts?

* The HavePartialClearThrough/CanPartialClearThrough mechanism needs
  a better explanation. The only explanation currently is this:

> if (--(PredXact->WritableSxactCount) == 0)
> {
>     /*
>      * Release predicate locks and rw-conflicts in for all committed
>      * transactions. There are no longer any transactions which might
>      * conflict with the locks and no chance for new transactions to
>      * overlap. Similarly, existing conflicts in can't cause pivots,
>      * and any conflicts in which could have completed a dangerous
>      * structure would already have caused a rollback, so any
>      * remaining ones must be benign.
>      */
>     PredXact->CanPartialClearThrough = PredXact->LastSxactCommitSeqNo;
> }

If I understand that correctly, any predicate lock belonging to any already committed transaction can be safely zapped away at that instant. We don't do it immediately, because it might be expensive. Instead, we set CanPartialClearThrough, and do it lazily in ClearOldPredicateLocks(). But what is the purpose of HavePartialClearThrough?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Here's a bunch of small issues that I spotted:

Thanks for going over it again. It is encouraging that you didn't spot any *big* issues.

> * The oldserxid code is broken for non-default BLCKSZ.
>   o The warning will come either too early or too late

Good point. That part is easily fixed -- if we want to keep the warning, in light of the next few points.

>   o There is no safeguard against actually wrapping around the
>     SLRU, just the warning

Any thoughts on what we should do instead? If someone holds open a transaction long enough to burn through a billion transaction IDs (or possibly less if someone uses a smaller BLCKSZ), should we generate a FATAL error?

Of course, one other option would be to allow more SLRU segment files, as you raised on another thread. Then this issue goes away, because they would hit other, existing, protections against transaction wraparound first.

>   o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>     with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
>     than necessary to cover the whole range of 2^32 transactions,
>     so at high XIDs, say 2^32-1, doesn't it incorrectly think
>     that low XIDs, say 10, are in the past, not the future?

I will look that over to see; but if this is broken, then one of the other SLRU users is probably broken, because I think I stole this code pretty directly from another spot.

>> /*
>>  * Keep a pointer to the currently-running serializable
>>  * transaction (if any) for quick reference.
>>  * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>  * casts?
>>  */
>> static volatile SERIALIZABLEXACT *MySerializableXact =
>>     InvalidSerializableXact;
>
> * So, should we remove it? In most places where contents of
>   MySerializableXact are modified, we're holding
>   SerializableXactHashLock in exclusive mode. However, there are
>   two exceptions:
>
>   o GetSafeSnapshot() modifies MySerializableXact->flags without
>     any lock. It also inspects
>     MySerializableXact->possibleUnsafeConflicts without a lock.
>     What if someone sets some other flag in the flags bitmap just
>     when GetSafeSnapshot() is about to set
>     SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
>     :-(.
>
>   o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>     without holding a lock. The same danger is here if someone
>     else tries to set some other flag concurrently.
>
> I think we should simply acquire SerializableXactHashLock in
> GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
> would make any difference in performance.
> CheckForSerializableConflictIn() might be called a lot,
> however, so maybe we need to check if the flag is set first,
> and only acquire the lock and set it if it's not set already.

OK.

Do checks such as that argue for keeping the volatile flag, or do you think we can drop it if we make those changes? (That would also allow dropping a number of casts which exist just to avoid warnings.)

> * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
>   ReleasePredicateLocks() for a fleeting moment while the
>   function releases all conflicts and locks held by the
>   transaction, and finally the sxact struct itself containing the
>   flag.

I think that one can go away. It had more of a point many months ago, before we properly sorted out what belongs in PreCommit_CheckForSerializationFailure() and what belongs in ReleasePredicateLocks(). At the point we reached clarity on that and moved things around, this flag probably became obsolete.

> Also, isn't a transaction that's already been marked for death
> the same as one that has already rolled back, for the purposes
> of detecting conflicts?

Yes. We should probably ignore any marked-for-death transaction during conflict detection and serialization failure detection. As a start, anywhere there is now a check for rollback and not for this, we should change it to this. There may be some places this can be checked which haven't yet been identified and touched.

> * The HavePartialClearThrough/CanPartialClearThrough mechanism
>   needs a better explanation. The only explanation currently is
>   this:
>
>> if (--(PredXact->WritableSxactCount) == 0)
>> {
>>     /*
>>      * Release predicate locks and rw-conflicts in for
>>      * all committed transactions. There are no longer any
>>      * transactions which might conflict with the locks and no
>>      * chance for new transactions to overlap. Similarly,
>>      * existing conflicts in can't cause pivots, and any
>>      * conflicts in which could have completed a dangerous
>>      * structure would already have caused a rollback, so any
>>      * remaining ones must be benign.
>>      */
>>     PredXact->CanPartialClearThrough =
>>         PredXact->LastSxactCommitSeqNo;
>> }
>
> If I understand that correctly, any predicate lock belonging to
> any already committed transaction can be safely zapped away at
> that instant.

Correct.

> We don't do it immediately, because it might be expensive.

I think it has more to do with getting the LW locks right. We make the call to ClearOldPredicateLocks() farther down in the function, so this is effectively setting things up for that call.

> But what is the purpose of HavePartialClearThrough?

To avoid doing unnecessary work for completed transactions on which we still need to keep some information, but for which we were previously able to clear predicate locks. This relates to the "mitigation" work discussed in these and other posts from around that time:

http://archives.postgresql.org/pgsql-hackers/2010-10/msg01754.php
http://archives.postgresql.org/pgsql-hackers/2010-12/msg02119.php

I'm happy to work on modifications for any of this or to stay out of your way if you want to. Should I put together a patch for those items where we seem to agree and have a clear way forward?

-Kevin
On 10.06.2011 18:05, Kevin Grittner wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> o There is no safeguard against actually wrapping around the
>>   SLRU, just the warning
>
> Any thoughts on what we should do instead? If someone holds open a
> transaction long enough to burn through a billion transaction IDs
> (or possibly less if someone uses a smaller BLCKSZ), should we
> generate a FATAL error?

FATAL is a bit harsh; ERROR seems more appropriate.

>> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>>   with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
>>   than necessary to cover the whole range of 2^32 transactions,
>>   so at high XIDs, say 2^32-1, doesn't it incorrectly think
>>   that low XIDs, say 10, are in the past, not the future?
>
> I will look that over to see; but if this is broken, then one of the
> other SLRU users is probably broken, because I think I stole this
> code pretty directly from another spot.

It was copied from async.c, which doesn't have this problem because it's not mapping XIDs to the SLRU. There, the max queue size is determined by the *_MAX_PAGE, and you point to a particular location in the SLRU with simply page+offset.

>>> /*
>>>  * Keep a pointer to the currently-running serializable
>>>  * transaction (if any) for quick reference.
>>>  * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>>  * casts?
>>>  */
>>> static volatile SERIALIZABLEXACT *MySerializableXact =
>>>     InvalidSerializableXact;
>>
>> * So, should we remove it? In most places where contents of
>>   MySerializableXact are modified, we're holding
>>   SerializableXactHashLock in exclusive mode. However, there are
>>   two exceptions:
>>
>>   o GetSafeSnapshot() modifies MySerializableXact->flags without
>>     any lock. It also inspects
>>     MySerializableXact->possibleUnsafeConflicts without a lock.
>>     What if someone sets some other flag in the flags bitmap just
>>     when GetSafeSnapshot() is about to set
>>     SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
>>     :-(.
>>
>>   o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>>     without holding a lock. The same danger is here if someone
>>     else tries to set some other flag concurrently.
>>
>> I think we should simply acquire SerializableXactHashLock in
>> GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
>> would make any difference in performance.
>> CheckForSerializableConflictIn() might be called a lot,
>> however, so maybe we need to check if the flag is set first,
>> and only acquire the lock and set it if it's not set already.
>
> OK.
>
> Do checks such as that argue for keeping the volatile flag, or do
> you think we can drop it if we make those changes? (That would also
> allow dropping a number of casts which exist just to avoid
> warnings.)

I believe we can drop it; I'll double-check.

> I'm happy to work on modifications for any of this or to stay out of
> your way if you want to. Should I put together a patch for those
> items where we seem to agree and have a clear way forward?

I'll fix the MySerializableXact locking issue, and come back to the other issues on Monday. If you have the time and energy to write a patch by then, feel free, but I can look into them otherwise.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote: > > Do checks such as that argue for keeping the volatile flag, or do > > you think we can drop it if we make those changes? (That would also > > allow dropping a number of casts which exist just to avoid > > warnings.) > > I believe we can drop it, I'll double-check. Yes, dropping it seems like the thing to do. It's been on my list for a while. We are not really getting anything out of declaring it volatile since we cast the volatile qualifier away most of the time. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
> Dan Ports wrote:
> On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:
>>> Do checks such as that argue for keeping the volatile flag, or do
>>> you think we can drop it if we make those changes? (That would
>>> also allow dropping a number of casts which exist just to avoid
>>> warnings.)
>>
>> I believe we can drop it, I'll double-check.
>
> Yes, dropping it seems like the thing to do. It's been on my list
> for a while. We are not really getting anything out of declaring it
> volatile since we cast the volatile qualifier away most of the
> time.

I'm not concerned about references covered by SerializableXactHashLock. I am more concerned about whether the (MySerializableXact == InvalidSerializableXact) checks, and any other tests not covered by that lock, are OK without it (and OK with it). Since my knowledge of weak memory ordering behavior is, well, weak, I didn't want to try to make that call.

-Kevin
On Sat, Jun 11, 2011 at 01:38:31PM -0500, Kevin Grittner wrote: > I'm not concerned about references covered by > SerializableXactHashLock. I am more concerned about some of the > tests for whether the (MySerializableXact == InvalidSerializableXact) > checks and any other tests not covered by that lock are OK without it > (and OK with it). Since my knowledge of weak memory ordering > behavior is, well, weak I didn't want to try to make that call. Oh, those checks are definitely not an issue -- MySerializableXact itself (the pointer, not the thing it's pointing to) is in backend-local memory, so it won't change under us. The volatile qualifier (as written) doesn't help with that anyway, it attaches to the data being pointed to, not the pointer itself. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
> Heikki Linnakangas wrote: > On 10.06.2011 18:05, Kevin Grittner wrote: >> Heikki Linnakangas wrote: >>> o There is no safeguard against actually wrapping around the >>> SLRU, just the warning >> >> Any thoughts on what we should do instead? If someone holds open a >> transaction long enough to burn through a billion transaction IDs >> (or possibly less if someone uses a smaller BLCKSZ), should we >> generate a FATAL error? > > FATAL is a bit harsh, ERROR seems more appropriate. If we don't cancel the long-running transaction, don't we continue to have a problem? >> Do checks such as that argue for keeping the volatile flag, or do >> you think we can drop it if we make those changes? (That would >> also allow dropping a number of casts which exist just to avoid >> warnings.) > > I believe we can drop it, I'll double-check. I see you committed a patch for this, but there were some casts which become unnecessary with that change that you missed. Patch attached to clean up the ones I could spot. -Kevin
On 12.06.2011 17:59, Kevin Grittner wrote: >> Heikki Linnakangas wrote: >> On 10.06.2011 18:05, Kevin Grittner wrote: >>> Heikki Linnakangas wrote: >>>> o There is no safeguard against actually wrapping around the >>>> SLRU, just the warning >>> >>> Any thoughts on what we should do instead? If someone holds open a >>> transaction long enough to burn through a billion transaction IDs >>> (or possibly less if someone uses a smaller BLCKSZ), should we >>> generate a FATAL error? >> >> FATAL is a bit harsh, ERROR seems more appropriate. > > If we don't cancel the long-running transaction, don't we continue to > have a problem? Yes, but ERROR is enough to kill the transaction. Unless it's in a subtransaction, I guess. But anyway, there's no guarantee that the long-running transaction will hit the problem first, or at all. You're much more likely to kill an innocent new transaction that tries to acquire its first locks, than an old long-running transaction that's been running for a while already, probably idle doing nothing, or doing a long seqscan on a large table with the whole table locked already. > I see you committed a patch for this, but there were some casts which > become unnecessary with that change that you missed. Patch attached > to clean up the ones I could spot. Ah, thanks, applied. I didn't realize those were also only needed because of the volatile qualifier. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 10.06.2011 18:05, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: >> * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in >> ReleasePredicateLocks() for a fleeting moment while the >> function releases all conflicts and locks held by the >> transaction, and finally the sxact struct itself containing the >> flag. > > I think that one can go away. It had more of a point many months > ago before we properly sorted out what belongs in > PreCommit_CheckForSerializationFailure() and what belongs in > ReleasePredicateLocks(). The point at which we reached clarity on > that and moved things around, this flag probably became obsolete. > >> Also, isn't a transaction that's already been marked for death >> the same as one that has already rolled back, for the purposes >> of detecting conflicts? > > Yes. > > We should probably ignore any marked-for-death transaction during > conflict detection and serialization failure detection. As a start, > anywhere there is now a check for rollback and not for this, we > should change it to this. Ok, I removed the SXACT_FLAG_ROLLED_BACK flag. I also renamed the marked-for-death flag into SXACT_FLAG_DOOMED; that's a lot shorter. > There may be some places this can be > checked which haven't yet been identified and touched. Yeah - in 9.2. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: >> There may be some places this can be checked which haven't yet >> been identified and touched. > > Yeah - in 9.2. No argument here. I'm all for stabilizing and getting the thing out -- I think we've established that performance is good for many workloads as it stands, and that there are workloads where it will never be useful. Chipping away at the gray area, to make it perform well in a few workloads where it currently doesn't (and, of course, *even better* in workloads where it's currently better than the alternatives), seems like future release work to me. There is one issue you raised in this post: http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com Robert questioned whether it should be 9.1 material here: http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com I posted a patch here: http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov Should I put that patch into a 9.2 CF? There is an unnecessary include of predicate.h in nbtree.c we should delete. That seems safe enough. You questioned whether OldSerXidPagePrecedesLogically was buggy. I will look at that by this weekend at the latest. If it is buggy we obviously should fix that. Are there any other changes you think we should make to handle the odd corner cases in the SLRU for SSI? It did occur to me that we should be safe from actual overwriting of an old entry by the normal transaction wrap-around protections -- the worst that should happen with the current code (I think) is that in extreme cases we may get LOG level messages or accumulate a surprising number of SLRU segment files. That's because SLRU will start to nag about things at one billion transactions, but we need to get all the way to two billion transactions used up while a single serializable transaction remains active before we could overwrite something. 
It seems like it might be a good idea to apply pgindent formating to the latest SSI changes, to minimize conflict on back-patching any bug fixes. I've attached a patch to do this formatting -- entirely whitespace changes from a pgindent run against selected files. Unless I'm missing something, the only remaining changes needed are for documentation (as mentioned in previous posts). I will work on those after I look at OldSerXidPagePrecedesLogically. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Unless I'm missing something, the only remaining changes needed > are for documentation (as mentioned in previous posts). I just found notes that we also need regression tests for the SSI/DDL combination and a comment in lazy_truncate_heap. At any rate, not anything which is part of executable code.... -Kevin
On 15.06.2011 19:10, Kevin Grittner wrote: > There is an unnecessary include of predicate.h in nbtree.c we should > delete. That seems safe enough. >... > It seems like it might be a good idea to apply pgindent formating to > the latest SSI changes, to minimize conflict on back-patching any > bug fixes. I've attached a patch to do this formatting -- entirely > whitespace changes from a pgindent run against selected files. Ok, committed the pgindent patch and removed the unnecessary include. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 15.06.2011 19:10, Kevin Grittner wrote: > There is one issue you raised in this post: > > http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com > > Robert questioned whether it should be 9.1 material here: > > http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com > > I posted a patch here: > > http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov > > Should I put that patch into a 9.2 CF? Yeah. I've added it to the September commitfest. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:

> * The oldserxid code is broken for non-default BLCKSZ.
>   o The warning will come either too early or too late
>   o There is no safeguard against actually wrapping around the
>     SLRU, just the warning
>   o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>     with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
>     than necessary to cover the whole range of 2^32 transactions,
>     so at high XIDs, say 2^32-1, doesn't it incorrectly think that
>     low XIDs, say 10, are in the past, not the future?

It took me a while to see these problems because somehow I had forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather than being based on BLCKSZ. After I rediscovered that, your concern was clear enough.

I think the attached patch addresses the problem with the OldSerXidPagePrecedesLogically() function, which strikes me as the most serious issue.

Based on the calculation from the attached patch, it would be easy to adjust the warning threshold, but I'm wondering if we should just rip it out instead. As I mentioned in a recent post based on reviewing your concerns, with an 8KB BLCKSZ the SLRU system will start thinking we're into wraparound at one billion transactions, and refuse to truncate segment files until we get below that, but we won't actually overwrite anything and cause SSI misbehavior until we eat through two billion (2^31 really) transactions while holding open a single read-write transaction. At that point I think PostgreSQL has other defenses which come into play. With the attached patch I don't think we can have any such problems with a *larger* BLCKSZ, so the only point of the warning would be for a BLCKSZ of 4KB or less. Is it worth carrying the warning code (with an appropriate adjustment to the thresholds) or should we drop it?

-Kevin
On Sun, Jun 19, 2011 at 7:17 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Heikki Linnakangas wrote: > >> * The oldserxid code is broken for non-default BLCKSZ. >> o The warning will come either too early or too late >> o There is no safeguard against actually wrapping around the >> SLRU, just the warning >> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic >> with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger >> than necessary to cover the whole range of 2^32 transactions, >> so at high XIDs, say 2^32-1, doesn't it incorrectly think that >> low XIDs, say 10, are in the past, not the future? > > It took me a while to see these problems because somehow I had > forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather than > being based on BLCKSZ. After I rediscovered that, your concern was > clear enough. > > I think the attached patch addresses the problem with the > OldSerXidPagePrecedesLogically() function, which strikes me as the > most serious issue. > > Based on the calculation from the attached patch, it would be easy to > adjust the warning threshold, but I'm wondering if we should just rip > it out instead. As I mentioned in a recent post based on reviewing > your concerns, with an 8KB BLCKSZ the SLRU system will start thinking > we're into wraparound at one billion transactions, and refuse to > truncate segment files until we get down to less than that, but we > won't actually overwrite anything and cause SSI misbehaviors until we > eat through two billion (2^31 really) transactions while holding open > a single read-write transaction. At that point I think PostgreSQL > has other defenses which come into play. With the attached patch I > don't think we can have any such problems with a *larger* BLCKSZ, so > the only point of the warning would be for a BLCKSZ of 4KB or less. > Is it worth carrying the warning code (with an appropriate adjustment > to the thresholds) or should we drop it? Is this still an open item? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jun 19, 2011 at 7:17 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Heikki Linnakangas wrote:
>>
>>> * The oldserxid code is broken for non-default BLCKSZ.
>>>   o The warning will come either too early or too late
>>>   o There is no safeguard against actually wrapping around the
>>>     SLRU, just the warning
>>>   o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>>>     with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
>>>     than necessary to cover the whole range of 2^32 transactions,
>>>     so at high XIDs, say 2^32-1, doesn't it incorrectly think that
>>>     low XIDs, say 10, are in the past, not the future?
>>
>> It took me a while to see these problems because somehow I had
>> forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather
>> than being based on BLCKSZ. After I rediscovered that, your
>> concern was clear enough.
>>
>> I think the attached patch addresses the problem with the
>> OldSerXidPagePrecedesLogically() function, which strikes me as
>> the most serious issue.
>>
>> Based on the calculation from the attached patch, it would be
>> easy to adjust the warning threshold, but I'm wondering if we
>> should just rip it out instead. As I mentioned in a recent post
>> based on reviewing your concerns, with an 8KB BLCKSZ the SLRU
>> system will start thinking we're into wraparound at one billion
>> transactions, and refuse to truncate segment files until we get
>> down to less than that, but we won't actually overwrite anything
>> and cause SSI misbehaviors until we eat through two billion (2^31
>> really) transactions while holding open a single read-write
>> transaction. At that point I think PostgreSQL has other defenses
>> which come into play. With the attached patch I don't think we
>> can have any such problems with a *larger* BLCKSZ, so the only
>> point of the warning would be for a BLCKSZ of 4KB or less.
>> Is it worth carrying the warning code (with an appropriate
>> adjustment to the thresholds) or should we drop it?
>
> Is this still an open item?

Yes, although I'm not clear on whether people feel it is one which needs to be fixed for 9.1 or left for 9.2.

On a build with a BLCKSZ less than 8KB, we would not get a warning before problems occurred, and we would have a more serious problem involving potentially incorrect behavior. Tom questioned whether anyone actually builds with a BLCKSZ less than 8KB, and I'm not altogether sure that SLRUs dealing with transaction IDs would behave correctly either.

On block sizes larger than 8KB, it will issue the warning if you burn through one billion transactions while holding one serializable read-write transaction open, even though there won't be a problem. With the larger BLCKSZ values it may also generate log-level messages about SLRU wraparound when that's not really a problem.

The patch posted with the quoted message will prevent the misbehavior on smaller block sizes and the bogus log messages on larger block sizes. We'd need to change a couple more lines to get the warning to trigger at the appropriate time for different block sizes -- or we could just rip out the warning code. (I didn't post a patch for that because there wasn't a clear consensus about whether to fix it, rip it out, or leave it alone for 9.1.)

-Kevin
On Tue, Jul 5, 2011 at 10:51 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: >> Is this still an open item? > > Yes, although I'm not clear on whether people feel it is one which > needs to be fixed for 9.1 or left for 9.2. > > On a build with a BLCKSZ less than 8KB we would not get a warning > before problems occurred, and we would have more serious problem > involving potentially incorrect behavior. Tom questioned whether > anyone actually builds with BLCKSZ less than 8KB, and I'm not > altogether sure that SLRUs dealing with transaction IDs would behave > correctly either. > > On block sizes larger than 8KB it will the warning if you burn > through one billion transactions while holding one serializable read > write transaction open, even though there won't be a problem. With > the larger BLCKSZ values it may also generate log level messages > about SLRU wraparound when that's not really a problem. Well, as long as we can verify that OLDSERXID_MAX_PAGE has the same value for BLCKSZ=8K before and after this patch, I don't see any real downside to applying it. If, hypothetically, it's buggy, it's only going to break things for non-default block sizes which are, by your description, not correct right now anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote:

> Well, as long as we can verify that OLDSERXID_MAX_PAGE has the
> same value for BLCKSZ=8K before and after this patch, I don't see
> any real downside to applying it. If, hypothetically, it's buggy,
> it's only going to break things for non-default block sizes which
> are, by your description, not correct right now anyway.

Outside of a code comment, the entire patch consists of replacing the definition of the OLDSERXID_MAX_PAGE macro. The old definition is:

(SLRU_PAGES_PER_SEGMENT * 0x10000 - 1)

SLRU_PAGES_PER_SEGMENT is defined to be 32. So this is:

(32 * 0x10000) - 1 = 2097151

The new definition is the min of the old one and a value based on BLCKSZ:

((MaxTransactionId + 1) / OLDSERXID_ENTRIESPERPAGE - 1)

Where "entries per page" is BLCKSZ / sizeof(uint64). For an 8K BLCKSZ this is:

((0xffffffff + 1) / 1024) - 1 = 4194303

So the macro is guaranteed to have the same value as it currently does for a BLCKSZ of 16KB or lower.

-Kevin
On Tue, Jul 5, 2011 at 12:03 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> Well, as long as we can verify that OLDSERXID_MAX_PAGE has the >> same value for BLCKSZ=8K before and after this patch, I don't see >> any real downside to applying it. If, hypothetically, it's buggy, >> it's only going to break things for non-default block sizes which >> are, by your description, not correct right now anyway. > > Outside of a code comment, the entire patch consists of replacing > the definition of the OLDSERXID_MAX_PAGE macro. The old definition > is: > > (SLRU_PAGES_PER_SEGMENT * 0x10000 - 1) > > SLRU_PAGES_PER_SEGMENT is define to be 32. So this is: > > (32 * 0x10000) - 1 = 2097151 > > The new definition is the min of the old one and a value based on > BLCKSZ: > > (MaxTransactionId + 1) / OLDSERXID_ENTRIESPERPAGE - 1) > > Where "entries per page" is BLCKSZ / sizeof(uint64). > > For an 8K BLCKSZ this is: > > ((0xffffffff + 1) / 1024) - 1 = 4194303 > > So the macro is guaranteed to have the same value as it currently > does for BLCKSZ of 16KB or lower. I went ahead and committed this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company