Thread: reindex creates predicate lock on index root
During testing of the SSI DDL changes I noticed that a REINDEX INDEX created a predicate lock on page 0 of the index. This is pretty harmless, but mildly annoying. There are a few other places where it would be good to suppress predicate locks; these are listed on the R&D section of the Wiki. I hope to clean some of these up in 9.2. Unless a very clean and safe fix for the subject issue pops out on further review, I'll add this to that list.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> During testing of the SSI DDL changes I noticed that a REINDEX INDEX
> created a predicate lock on page 0 of the index. This is pretty
> harmless, but mildly annoying. There are a few other places where
> it would be good to suppress predicate locks; these are listed on
> the R&D section of the Wiki. I hope to clean some of these up in
> 9.2. Unless a very clean and safe fix for the subject issue pops out
> on further review, I'll add this to that list.

Do you mean page zero, as in the metapage (for most index types), or do you mean the root page? If the former, how is that not an outright bug, since it corresponds to no data? If the latter, how is that not a serious performance problem, since it corresponds to locking the entire index? Any way you slice it, it sounds like a pretty bad bug.

It's not apparent to me why an index build (regular or reindex) should create any predicate locks at all, ever. Surely it's a basically nontransactional operation that SSI should keep its fingers out of.

regards, tom lane
On Tue, Jun 07, 2011 at 07:45:43PM -0500, Kevin Grittner wrote:
> During testing of the SSI DDL changes I noticed that a REINDEX INDEX
> created a predicate lock on page 0 of the index.

Really? That surprises me, and I couldn't reproduce it just now.

Dan

-- 
Dan R. K. Ports  MIT CSAIL  http://drkp.net/
Dan Ports wrote:
> On Tue, Jun 07, 2011 at 07:45:43PM -0500, Kevin Grittner wrote:
>> During testing of the SSI DDL changes I noticed that a REINDEX
>> INDEX created a predicate lock on page 0 of the index.
>
> Really? That surprises me, and I couldn't reproduce it just now.

You're right; that one was a false alarm. While REINDEX was reading the heap to build the index it got an SIREAD lock on a *heap* page. While that could arguably be avoided, even though the heap is being read in a serializable transaction, I'm not inclined to get really excited about it. If someone wants to dodge it, they can always run the REINDEX in READ COMMITTED or REPEATABLE READ mode. Maybe 9.2 material if there's nothing to do that matters more than that.

"Move along; there's nothing to see here, folks...."

-Kevin
Excerpts from Kevin Grittner's message of mar jun 07 20:45:43 -0400 2011:
> During testing of the SSI DDL changes I noticed that a REINDEX INDEX
> created a predicate lock on page 0 of the index.

Err, in a btree, page 0 is the metapage, not the root. The root page moves around, but it's never page 0 (it starts at page 1). I think GIN indexes also have metapages, not sure about the others. Not sure if this changes your reasoning at all, just thought it was worth mentioning.

> This is pretty
> harmless, but mildly annoying. There are a few other places where
> it would be good to suppress predicate locks; these are listed on
> the R&D section of the Wiki. I hope to clean some of these up in
> 9.2. Unless a very clean and safe fix for the subject issue pops out
> on further review, I'll add this to that list.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 08.06.2011 06:37, Kevin Grittner wrote:
> You're right; that one was a false alarm. While REINDEX was reading
> the heap to build the index it got an SIREAD lock on a *heap* page.

We never take locks on heap pages directly, so it must've been a promotion from heap tuple locks.

> While that could arguably be avoided, even though the heap is being
> read in a serializable transaction, I'm not inclined to get really
> excited about it. If someone wants to dodge it, they can always run
> the REINDEX in READ COMMITTED or REPEATABLE READ mode. Maybe 9.2
> material if there's nothing to do that matters more than that.

That should be pretty easy to avoid, though. I think we should fix it now.

-- 
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
On Tue, Jun 07, 2011 at 10:14:30PM -0400, Tom Lane wrote:
> Do you mean page zero, as in the metapage (for most index types), or do
> you mean the root page? If the former, how is that not an outright bug,
> since it corresponds to no data? If the latter, how is that not a
> serious performance problem, since it corresponds to locking the entire
> index? Any way you slice it, it sounds like a pretty bad bug.

It's a moot point since that isn't actually happening, but FYI, we only acquire and check for locks on b-tree leaf pages, so locking the root wouldn't have any effect. (This won't be true for other index types; GiST comes to mind.)

> It's not apparent to me why an index build (regular or reindex) should
> create any predicate locks at all, ever. Surely it's a basically
> nontransactional operation that SSI should keep its fingers out of.

It shouldn't. What's happening is that when IndexBuildHeapScan reads the heap tuples, heap_getnext takes a lock if it's running inside a serializable transaction. It doesn't (yet) know that it doesn't need the locks for an index build.

We could set a flag in the HeapScanDesc to indicate this. Actually, setting rs_relpredicatelocked has exactly that effect, so we ought to be able to use that (but might want to change the name).

Dan

-- 
Dan R. K. Ports  MIT CSAIL  http://drkp.net/
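To make the per-scan flag idea concrete, here is a minimal, self-contained sketch (not the real backend code) of how a HeapScanDesc-level flag like rs_relpredicatelocked could suppress tuple-level SIREAD locks during an index build. The struct, field, and function names below are trimmed stand-ins for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for PostgreSQL's HeapScanDescData; only the field relevant
 * to this discussion is modeled. */
typedef struct HeapScanDescData
{
    bool rs_relpredicatelocked; /* whole relation already locked, or
                                 * predicate locking suppressed
                                 * (e.g. for an index build) */
} HeapScanDescData;

static int siread_locks_taken = 0;

/* Stub for the predicate-lock acquisition that heap_getnext would do. */
static void
PredicateLockTuple_stub(void)
{
    siread_locks_taken++;
}

/* Mimics the decision point in heap_getnext(): take a tuple-level
 * SIREAD lock only when the scan has not opted out via the flag. */
static void
heap_getnext_sketch(HeapScanDescData *scan)
{
    if (!scan->rs_relpredicatelocked)
        PredicateLockTuple_stub();
}
```

An index-build path would set the flag before scanning, so every call through heap_getnext_sketch skips the lock; an ordinary serializable scan leaves it false.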
On Wed, Jun 8, 2011 at 4:59 AM, Dan Ports <drkp@csail.mit.edu> wrote:
> On Tue, Jun 07, 2011 at 10:14:30PM -0400, Tom Lane wrote:
>> Do you mean page zero, as in the metapage (for most index types), or do
>> you mean the root page? If the former, how is that not an outright bug,
>> since it corresponds to no data? If the latter, how is that not a
>> serious performance problem, since it corresponds to locking the entire
>> index? Any way you slice it, it sounds like a pretty bad bug.
>
> It's a moot point since that isn't actually happening, but FYI, we only
> acquire and check for locks on b-tree leaf pages so locking the root
> wouldn't have any effect. (This won't be true for other index types;
> GIST comes to mind.)
>
>> It's not apparent to me why an index build (regular or reindex) should
>> create any predicate locks at all, ever. Surely it's a basically
>> nontransactional operation that SSI should keep its fingers out of.
>
> It shouldn't. What's happening is that when IndexBuildHeapScan reads
> the heap tuples, heap_getnext takes a lock if it's running inside a
> serializable transaction. It doesn't (yet) know that it doesn't need
> the locks for an index build.
>
> We could set a flag in the HeapScanDesc to indicate this. Actually,
> setting rs_relpredicatelocked has exactly that effect, so we ought to
> be able to use that (but might want to change the name).

I'm wondering if this shouldn't be linked to whether the scan is using an MVCC snapshot, rather than inserting exceptions for specific operations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> wrote:
> I'm wondering if this shouldn't be linked to whether the scan is
> using an MVCC snapshot, rather than inserting exceptions for
> specific operations.

Yeah, that was raised before somewhere and I spaced it. Grabbing predicate locks for non-MVCC snapshots is nonsense, and the fix is a one-line addition to the SkipSerialization macro defined and used in predicate.c. Patch attached.

I agree with your other post that changes which are in the nature of improving performance (which for SSI includes changes which reduce the number of false positive serialization failures for serializable transactions) should be viewed very cautiously in terms of 9.1 inclusion. We're already at a point where a DBT-2 benchmark only shows a 2% hit for SSI overhead, including transaction restarts for serialization failures. I'd love to get that down to 1% or lower, but I don't want to create any risk of introducing bugs in 9.1 at this point. I think this one-liner might be worth considering, since it is very low risk and fixes an undesirable behavior which some (Tom, at least, it would appear) would have no trouble categorizing as a bug.

The attached patch has not yet been tested, but I'll test it today along with the latest committed code.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> *** a/src/backend/storage/lmgr/predicate.c
> --- b/src/backend/storage/lmgr/predicate.c
> ***************
> *** 274,279 ****
> --- 274,280 ----
>   #define SkipSerialization(relation) \
>       ((!IsolationIsSerializable()) \
>       || ((MySerializableXact == InvalidSerializableXact)) \
> +     || (!IsMVCCSnapshot(GetActiveSnapshot())) \
>       || ReleasePredicateLocksIfROSafe() \
>       || SkipPredicateLocksForRelation(relation))

While I agree with the goal here, this implementation seems fairly dangerous. The recommendation was to check *the snapshot being used in the scan*, and I think you have to do it that way. This macro isn't necessarily checking the right snapshot. What's more, if it's ever used in a place where there is no "active" snapshot, it'd dump core outright. I think you probably need to add the snapshot as an explicit parameter to the macro if you're going to do this.

BTW, am I reading the function names right to suspect that ReleasePredicateLocksIfROSafe might be something with side-effects? Yuck.

regards, tom lane
On 08.06.2011 18:09, Kevin Grittner wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I'm wondering if this shouldn't be linked to whether the scan is
>> using an MVCC snapshot, rather than inserting exceptions for
>> specific operations.
>
> Yeah, that was raised before somewhere and I spaced it. Grabbing
> predicate locks for non-MVCC snapshots is nonsense, and the fix is a
> one-line addition to the SkipSerialization macro defined and used in
> predicate.c. Patch attached.
>
> I agree with your other post that changes which are in the nature of
> improving performance (which for SSI includes changes which reduce
> the number of false positive serialization failures for serializable
> transactions) should be viewed very cautiously in terms of 9.1
> inclusion. We're already at a point where a DBT-2 benchmark only
> shows a 2% hit for SSI overhead, including transaction restarts for
> serialization failures. I'd love to get that down to 1% or lower,
> but I don't want to create any risk of introducing bugs in 9.1 at
> this point. I think this one-liner might be worth considering,
> since it is very low risk and fixes an undesirable behavior which
> some (Tom, at least, it would appear) would have no trouble
> categorizing as a bug.

I also think this should be fixed.

> The attached patch has not yet been tested, but I'll test it today
> along with the latest committed code.

You can't use GetActiveSnapshot() for this. You can have one snapshot pushed to the active snapshot stack, and do a DDL operation like reindex using a different snapshot. You'll have to check the snapshot in the HeapScanDesc.

-- 
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> The attached patch has not yet been tested, but I'll test it
>> today along with the latest committed code.
>
> You can't use GetActiveSnapshot() for this.

Yeah, it didn't take much testing to find that out. I had a naive assumption that GetActiveSnapshot would return whatever snapshot was in use at the point of the call.

> You can have one snapshot pushed to the active snapshot stack, and
> do a DDL operation like reindex using a different snapshot. You'll
> have to check the snapshot in the HeapScanDesc.

Will look at that. Do you think it makes more sense to pass in the snapshot on all these calls and test it within predicate.c, or condition the calls on this?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> You can have one snapshot pushed to the active snapshot stack, and
>> do a DDL operation like reindex using a different snapshot. You'll
>> have to check the snapshot in the HeapScanDesc.

> Will look at that. Do you think it makes more sense to pass in the
> snapshot on all these calls and test it within predicate.c, or
> condition the calls on this?

I'd vote for passing in the snapshot. I can foresee wanting to know exactly which snapshot is in question in this code, anyway.

regards, tom lane
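For illustration, here is a rough sketch of what the snapshot-as-parameter shape being discussed might look like. This is not the real predicate.c code: the stub types and the in_serializable_xact flag stand in for the actual IsolationIsSerializable()/MySerializableXact checks, and only the MVCC test is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for PostgreSQL's SnapshotData; only the property
 * this discussion cares about is modeled. */
typedef struct SnapshotData
{
    bool is_mvcc;               /* stand-in for IsMVCCSnapshot() */
} SnapshotData;
typedef SnapshotData *Snapshot;

/* Stub for the IsolationIsSerializable()/MySerializableXact tests. */
static bool in_serializable_xact = true;

/* The scan's own snapshot is passed in explicitly, rather than
 * consulting the active-snapshot stack, so the macro always tests the
 * snapshot actually driving the scan, and a NULL snapshot cannot
 * dereference anything. */
#define SkipSerialization_sketch(snapshot) \
    (!in_serializable_xact || \
     (snapshot) == NULL || \
     !(snapshot)->is_mvcc)
```

With this shape, a REINDEX scanning the heap under a non-MVCC snapshot short-circuits before any predicate-lock work, regardless of what happens to be on the active snapshot stack.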
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> *** a/src/backend/storage/lmgr/predicate.c
>> --- b/src/backend/storage/lmgr/predicate.c
>> ***************
>> *** 274,279 ****
>> --- 274,280 ----
>>   #define SkipSerialization(relation) \
>>       ((!IsolationIsSerializable()) \
>>       || ((MySerializableXact == InvalidSerializableXact)) \
>> +     || (!IsMVCCSnapshot(GetActiveSnapshot())) \
>>       || ReleasePredicateLocksIfROSafe() \
>>       || SkipPredicateLocksForRelation(relation))

> BTW, am I reading the function names right to suspect that
> ReleasePredicateLocksIfROSafe might be something with side-effects?
> Yuck.

I'm open to other suggestions on this. Let me explain what's up here.

A READ ONLY transaction can have a "safe" snapshot, in which case it doesn't need to acquire predicate locks or participate in dangerous structure detection. We check for this at transaction startup and start the transaction with MySerializableXact set to InvalidSerializableXact in that case, so it would never make it past that test. We also have the new DEFERRABLE transaction property which causes transaction startup to *wait* for a safe snapshot. (SIREAD locks never block anything; this request for a SERIALIZABLE READ ONLY DEFERRABLE transaction is the only blocking introduced in SSI.)

But a READ ONLY transaction's snapshot can *become* safe while it is running. We felt it was worthwhile to watch for this and flag the transaction as safe, to minimize predicate lock space used, CPU consumed, LW lock contention, and false positive serialization failures. While this is actually detected, and the transaction flagged as safe, during commit or rollback of a READ WRITE transaction, proper cleanup can only be done (at least without a lot of new mechanism and locking) by the now-safe transaction's process. The points where it makes sense to check this and do the cleanup correspond exactly to the places where this macro is called.

We could take ReleasePredicateLocksIfROSafe() out of the SkipSerialization(relation) macro, but we'd need to ensure that these macros are always called as a pair, and even then we wouldn't be calling it in the place where it makes the most sense. So, while this construction does make one squirm a little, it seems a sane accommodation to reality. If anyone can spot a cleaner way to do it, that'd be great.

-Kevin
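The side-effecting check Kevin describes can be modeled in a few lines. This is an illustrative stand-in, not the real predicate.c function: a committing READ WRITE transaction flags the READ ONLY transaction as safe, and the RO transaction's own process drops its SIREAD locks the next time it reaches a predicate-lock decision point.

```c
#include <assert.h>
#include <stdbool.h>

/* Set (conceptually by a committing READ WRITE transaction) when this
 * READ ONLY transaction's snapshot has become safe. */
static bool flagged_safe = false;

/* SIREAD locks this backend currently holds (stand-in counter). */
static int locks_held = 3;

/* Returns true, and releases the locks, once the snapshot is safe.
 * Returning a bool lets it sit in a short-circuit chain of skip
 * conditions, which is why the macro placement makes sense even
 * though the function has side effects. */
static bool
ReleasePredicateLocksIfROSafe_sketch(void)
{
    if (!flagged_safe)
        return false;
    locks_held = 0;             /* cleanup done by our own process */
    return true;
}
```

The key point the sketch captures is that the release happens lazily, at the next call site of the skip test, because only the now-safe transaction's own process can do the cleanup without extra locking machinery.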