Thread: Refactoring speculative insertion with unique indexes a little
Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE executor/storage infrastructure) uses checkUnique == UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant originally only used by deferred unique constraints. It occurred to me that this has a number of disadvantages: * It confuses separation of concerns. Pushing down this information to the nbtree AM makes it clear why it's slightly special from a speculative insertion point of view. For example, the nbtree AM does not actually require "livelock insurance" (for unique indexes, although in principle not for nbtree-based exclusion constraints, which are possible). * UNIQUE_CHECK_PARTIAL is not only not the same thing as UNIQUE_CHECK_SPECULATIVE (a new constant for the enum). It's also naturally mutually exclusive with it (since we do not and cannot support deferred unique constraints as arbiters). Let's represent this directly. * It makes a conflict not detected by the pre-check always insert an index tuple, even though that occurs after a point where it's already been established that the pointed-to TID is doomed -- it must go on to be super deleted. Why bother bloating the index? I'm actually not really motivated by wanting to reduce bloat here (that was always something that I thought was a non-issue with *any* implemented speculative insertion prototype [1]). Rather, by actually physically inserting an index tuple unnecessarily, the implication is that it makes sense to do so (perhaps for roughly the same reason it makes sense with deferred unique constraints, or some other complicated and subtle reason.). AFAICT that implication is incorrect, though; I see no reason why inserting that index tuple serves any purpose, and it clearly *can* be avoided with little effort. Attached patch updates speculative insertion along these lines. In passing, I have make ExecInsertIndexTuples() give up when a speculative insertion conflict is detected. Again, this is not about bloat prevention; it's about making the code easier to understand by not doing something that is unnecessary, and in avoiding that also avoiding the implication that it is necessary. There are already enough complicated interactions that *are* necessary (e.g. "livelock avoidance" for exclusion constraints). Let us make our intent clearer. The patch also updates the AM interface documentation (the part covering unique indexes). It seems quite natural to me to document the theory of operation for speculative insertion there. Thoughts? [1] https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29 -- Peter Geoghegan
Attachment
On 06/11/2015 02:19 AM, Peter Geoghegan wrote: > Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE > executor/storage infrastructure) uses checkUnique == > UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant > originally only used by deferred unique constraints. It occurred to me > that this has a number of disadvantages: > > * It confuses separation of concerns. Pushing down this information to > the nbtree AM makes it clear why it's slightly special from a > speculative insertion point of view. For example, the nbtree AM does > not actually require "livelock insurance" (for unique indexes, > although in principle not for nbtree-based exclusion constraints, > which are possible). > > * UNIQUE_CHECK_PARTIAL is not only not the same thing as > UNIQUE_CHECK_SPECULATIVE (a new constant for the enum). It's also > naturally mutually exclusive with it (since we do not and cannot > support deferred unique constraints as arbiters). Let's represent this > directly. > > * It makes a conflict not detected by the pre-check always insert an > index tuple, even though that occurs after a point where it's already > been established that the pointed-to TID is doomed -- it must go on to > be super deleted. Why bother bloating the index? > > > I'm actually not really motivated by wanting to reduce bloat here > (that was always something that I thought was a non-issue with *any* > implemented speculative insertion prototype [1]). Rather, by actually > physically inserting an index tuple unnecessarily, the implication is > that it makes sense to do so (perhaps for roughly the same reason it > makes sense with deferred unique constraints, or some other > complicated and subtle reason.). AFAICT that implication is incorrect, > though; I see no reason why inserting that index tuple serves any > purpose, and it clearly *can* be avoided with little effort. I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if there's a conflict". I think it'd be better to define it as "like CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on conflict". The difference is that the aminsert would not be allowed to return FALSE when there is no conflict. Actually, even without this patch, a dummy implementation of aminsert that always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal per the docs, but would loop forever. So that would be nice to fix or document away, even though it's not a problem for B-tree currently. > Attached patch updates speculative insertion along these lines. > > In passing, I have make ExecInsertIndexTuples() give up when a > speculative insertion conflict is detected. Again, this is not about > bloat prevention; it's about making the code easier to understand by > not doing something that is unnecessary, and in avoiding that also > avoiding the implication that it is necessary. There are already > enough complicated interactions that *are* necessary (e.g. "livelock > avoidance" for exclusion constraints). Let us make our intent clearer. Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour now depends on whether specConflict is passed, but that seems weird as specConflict is an output parameter. > The patch also updates the AM interface documentation (the part > covering unique indexes). It seems quite natural to me to document the > theory of operation for speculative insertion there. Yeah, although I find the explanation pretty long-winded and difficult to understand ;-). - Heikki
On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for > speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like > CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if > there's a conflict". I think it'd be better to define it as "like > CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on > conflict". The difference is that the aminsert would not be allowed to > return FALSE when there is no conflict. Suppose we do it that way. Then what's the difference between CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just effectively required the CHECK_UNIQUE_YES case to not physically insert a physical tuple before throwing an error, which does not seem essential to the existing definition of CHECK_UNIQUE_YES -- you've redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the moment. If we had an amcanunique AM that worked a bit like exclusion constraints, this new obligation for CHECK_UNIQUE_YES might make it impossible for that to work. I'm not necessarily in disagreement. I just didn't do it that way for that reason. > Actually, even without this patch, a dummy implementation of aminsert that > always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal per the > docs, but would loop forever. So that would be nice to fix or document away, > even though it's not a problem for B-tree currently. UNIQUE_CHECK_PARTIAL calls to an aminsert function are allowed to report false positives (of not being unique, by returning FALSE as you say), where there may not have been a conflict, but it's not clear (e.g. because the conflicting xact has yet to commit/abort). You still need to insert, though, so that the recheck at end-of-xact works for deferred constraints. At that point, it's really not too different to ordinary unique enforcement, except that the other guy is guaranteed to notice you, too. You can construct a theoretical case where lock starvation occurs with unique constraint enforcement. I think it helps with nbtree here that someone will reliably *not* see a conflict when concurrently inserting, because unique index "value locking" exists (by locking the first leaf page a value could be on with a buffer lock). But even if that wasn't the case, the insert + recheck thing would be safe, just as with exclusion constraints...provided you insert to begin with, that is. >> In passing, I have make ExecInsertIndexTuples() give up when a >> speculative insertion conflict is detected. Again, this is not about >> bloat prevention; it's about making the code easier to understand by >> not doing something that is unnecessary, and in avoiding that also >> avoiding the implication that it is necessary. There are already >> enough complicated interactions that *are* necessary (e.g. "livelock >> avoidance" for exclusion constraints). Let us make our intent clearer. > > Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour now > depends on whether specConflict is passed, but that seems weird as > specConflict is an output parameter. Your statement is ambiguous and confusing to me. Are you objecting to the idea of returning from ExecInsertIndexTuples() "early" in general, or to one narrow aspect of how that was proposed -- the fact that it occurs due to "specConflict != NULL" rather than "noDupErr"? Obviously if it's just the latter, then that's a small adjustment. But AFAICT your remark about specConflict could one detail of a broader complaint about an idea that you dislike generally. >> The patch also updates the AM interface documentation (the part >> covering unique indexes). It seems quite natural to me to document the >> theory of operation for speculative insertion there. > > > Yeah, although I find the explanation pretty long-winded and difficult to > understand ;-). I don't know what you mean. You say things like this to me fairly regularly, but I'm always left wondering what the take-away should be. Please be more specific. Long-winded generally means that more words than necessary were used. I think that the documentation proposed is actually very information dense, if anything. As always, I aimed to keep it consistent with the surrounding documentation. I understand that the material might be a little hard to follow, but it concerns how someone can externally implement a Postgres index access method that is "amcanunique". That's a pretty esoteric subject -- so far, we've had exactly zero takers (even without the "amcanunique" part). It's damn hard to make these ideas accessible. I feel that I have an obligation to share information about (say) how the speculative insertion mechanism works -- the "theory of operation" seems very useful. Like any one of us, I might not be around to provide input on something like that in the future, so it makes sense to be thorough and unambiguous. There are very few people (3?) who have a good sense of how it works currently, and there are some very subtle aspects to it that I tried to point out. -- Peter Geoghegan
On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan <pg@heroku.com> wrote: > You can construct a theoretical case where lock starvation occurs with > unique constraint enforcement. I think it helps with nbtree here that > someone will reliably *not* see a conflict when concurrently > inserting, because unique index "value locking" exists (by locking the > first leaf page a value could be on with a buffer lock). But even if > that wasn't the case, the insert + recheck thing would be safe, just > as with exclusion constraints...provided you insert to begin with, > that is. Of course, the fact that UNIQUE_CHECK_PARTIAL aminsert callers (including deferred constraint inserters and, currently, speculative inserters) can rely on this is very useful to UPSERT. It guarantees progress of one session without messy exclusion constraint style fixes (for deadlock and livelock avoidance). You cannot talk about speculative insertion without talking about unprincipled deadlocks (and maybe livelocks). If I had to bet where we might find some bugs in the executor parts of UPSERT, my first guess would be the exclusion constraint edge-case handling (livelocking stuff). Those are probably relatively benign bugs, but recent bugs in exclusion constraints (in released branches) show that they can hide for a long time. -- Peter Geoghegan
On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for >> speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like >> CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if >> there's a conflict". I think it'd be better to define it as "like >> CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on >> conflict". The difference is that the aminsert would not be allowed to >> return FALSE when there is no conflict. > > Suppose we do it that way. Then what's the difference between > CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just > effectively required the CHECK_UNIQUE_YES case to not physically > insert a physical tuple before throwing an error, which does not seem > essential to the existing definition of CHECK_UNIQUE_YES -- you've > redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the > moment. If we had an amcanunique AM that worked a bit like exclusion > constraints, this new obligation for CHECK_UNIQUE_YES might make it > impossible for that to work. Another more obvious and important thing: CHECK_UNIQUE_YES waits for conflicts to be resolved before returning to its caller. If you don't get an error, you're done. CHECK_UNIQUE_PARTIAL never waits, and if we add a CHECK_UNIQUE_SPECULATIVE, it ought to not wait either. Sure, if a speculative inserter detects a conflict, it still has to wait. But not in the aminsert call, and not until it cleans up its physical insertion (by super-deleting). Clearly a CHECK_UNIQUE_SPECULATIVE would have to be much closer to CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES. -- Peter Geoghegan
On 07/01/2015 09:19 PM, Peter Geoghegan wrote: > On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan <pg@heroku.com> wrote: >> On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >>> I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for >>> speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like >>> CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if >>> there's a conflict". I think it'd be better to define it as "like >>> CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on >>> conflict". The difference is that the aminsert would not be allowed to >>> return FALSE when there is no conflict. >> >> Suppose we do it that way. Then what's the difference between >> CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just >> effectively required the CHECK_UNIQUE_YES case to not physically >> insert a physical tuple before throwing an error, which does not seem >> essential to the existing definition of CHECK_UNIQUE_YES -- you've >> redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the >> moment. If we had an amcanunique AM that worked a bit like exclusion >> constraints, this new obligation for CHECK_UNIQUE_YES might make it >> impossible for that to work. > > Another more obvious and important thing: CHECK_UNIQUE_YES waits for > conflicts to be resolved before returning to its caller. If you don't > get an error, you're done. CHECK_UNIQUE_PARTIAL never waits, and if we > add a CHECK_UNIQUE_SPECULATIVE, it ought to not wait either. > > Sure, if a speculative inserter detects a conflict, it still has to > wait. But not in the aminsert call, and not until it cleans up its > physical insertion (by super-deleting). Clearly a > CHECK_UNIQUE_SPECULATIVE would have to be much closer to > CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES. Why is it not OK for aminsert to do the waiting, with CHECK_UNIQUE_SPECULATIVE? - Heikki
On Thu, Jul 2, 2015 at 1:30 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Sure, if a speculative inserter detects a conflict, it still has to >> wait. But not in the aminsert call, and not until it cleans up its >> physical insertion (by super-deleting). Clearly a >> CHECK_UNIQUE_SPECULATIVE would have to be much closer to >> CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES. > > > Why is it not OK for aminsert to do the waiting, with > CHECK_UNIQUE_SPECULATIVE? Well, waiting means getting a ShareLock on the other session's XID. You can't do that without first releasing your locks, unless you're okay with unprincipled deadlocks. -- Peter Geoghegan
On Thu, Jul 2, 2015 at 10:49 AM, Peter Geoghegan <pg@heroku.com> wrote: > Well, waiting means getting a ShareLock on the other session's XID. > You can't do that without first releasing your locks, unless you're > okay with unprincipled deadlocks. Besides, if the other session wins the race and inserts a physical heap tuple, but then aborts its xact, we might have to insert again (although not before super-deleting), with that insert going on to be successful. As with row locking, with insertion, if there is a conflict, any outcome (UPDATE or INSERT) is then possible. -- Peter Geoghegan
On Thu, Jul 2, 2015 at 2:58 PM, Peter Geoghegan <pg@heroku.com> wrote: > As with row locking, with insertion, if there is a conflict, any > outcome (UPDATE or INSERT) is then possible. Where are we on this? It would be nice to get this out of the way before a 9.5 beta is put out. Thanks -- Peter Geoghegan
Hi, On 2015-06-10 16:19:27 -0700, Peter Geoghegan wrote: > Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE > executor/storage infrastructure) uses checkUnique == > UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant > originally only used by deferred unique constraints. It occurred to me > that this has a number of disadvantages: > ... > Attached patch updates speculative insertion along these lines. > ... > The patch also updates the AM interface documentation (the part > covering unique indexes). It seems quite natural to me to document the > theory of operation for speculative insertion there. I'm not arguing against any of this - but I don't think this needs to be on the 9.5 open items list. I plan to remove from there. Greetings, Andres Freund
On Sat, Oct 3, 2015 at 6:38 AM, Andres Freund <andres@anarazel.de> wrote: > I'm not arguing against any of this - but I don't think this needs to be > on the 9.5 open items list. I plan to remove from there. Obviously I don't think that this is a critical fix. I do think that it would be nice to keep the branches in sync, and that might become a bit more difficult after 9.5 is released. -- Peter Geoghegan
On Sun, Oct 4, 2015 at 2:07 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Sat, Oct 3, 2015 at 6:38 AM, Andres Freund <andres@anarazel.de> wrote: >> I'm not arguing against any of this - but I don't think this needs to be >> on the 9.5 open items list. I plan to remove from there. > > Obviously I don't think that this is a critical fix. I do think that > it would be nice to keep the branches in sync, and that might become a > bit more difficult after 9.5 is released. (A couple of months later) This is not an actual fix, but an optimization, no? UNIQUE_CHECK_SPECULATIVE is just used to optimize a couple of code paths in the case of a insert conflicting during btree insertion.. In any case, at this point 9.5 is really aimed to be stabilized, so targeting only master is a far saner approach IMO for this patch. Pushing that in 9.5 a couple of months back may have given enough reason to do so... But well life is life. + * it later Missing a dot here :) + * Set checkedIndex here, since partial unique index will still count + * as a found arbiter index despite being skipped due to predicate not + * being satisfied Ditto. -- Michael
On Wed, Dec 16, 2015 at 11:20 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > (A couple of months later) > This is not an actual fix, but an optimization, no? > UNIQUE_CHECK_SPECULATIVE is just used to optimize a couple of code > paths in the case of a insert conflicting during btree insertion.. > > In any case, at this point 9.5 is really aimed to be stabilized, so > targeting only master is a far saner approach IMO for this patch. > Pushing that in 9.5 a couple of months back may have given enough > reason to do so... But well life is life. No, this really isn't an optimization at all. It has nothing to do with performance, since I think that unnecessarily inserting an index tuple in practice has very little or no appreciable overhead. The point of avoiding that is that it makes no sense, while doing it implies that it does make sense. (i.e. It does not make sense to insert into a B-Tree leaf page an IndexTuple pointing to a speculative heap tuple with the certain knowledge that we'll have to super-delete the speculative heap tuple in the end anyway). This is 100% about clarifying the intent and design of the code. -- Peter Geoghegan
On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan <pg@heroku.com> wrote: >> In any case, at this point 9.5 is really aimed to be stabilized, so >> targeting only master is a far saner approach IMO for this patch. >> Pushing that in 9.5 a couple of months back may have given enough >> reason to do so... But well life is life. > > No, this really isn't an optimization at all. I should add: I think that the chances of this patch destabilizing the code are very slim, once it receives the proper review. Certainly, I foresee no possible downside to not inserting the doomed IndexTuple, since it's guaranteed to have its heap tuple super-deleted immediately afterwards. That's the only real behavioral change proposed here. So, I would prefer it if we got this in before the first stable release of 9.5. -- Peter Geoghegan
On Thu, Dec 17, 2015 at 4:55 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> In any case, at this point 9.5 is really aimed to be stabilized, so >>> targeting only master is a far saner approach IMO for this patch. >>> Pushing that in 9.5 a couple of months back may have given enough >>> reason to do so... But well life is life. >> >> No, this really isn't an optimization at all. > > I should add: I think that the chances of this patch destabilizing the > code are very slim, once it receives the proper review. Certainly, I > foresee no possible downside to not inserting the doomed IndexTuple, > since it's guaranteed to have its heap tuple super-deleted immediately > afterwards. I am no committer, just a guy giving an opinion about this patch and I have to admit that I have not enough studied the speculative insertion code to have a clear technical point of view on the matter, but even if the chances of destabilizing the code are slim, it does not seem a wise idea to me to do that post-rc1 and before a GA as there are people testing the code as it is now. Now per the two points listed in the first sentence in this paragraph, perhaps this opinion is fine as moot :) I didn't get much involved in the development of this code after all. > That's the only real behavioral change proposed here. So, I would > prefer it if we got this in before the first stable release of 9.5. Yeah, I got this one. -- Michael
On Thu, Dec 17, 2015 at 11:06 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> I should add: I think that the chances of this patch destabilizing the >> code are very slim, once it receives the proper review. Certainly, I >> foresee no possible downside to not inserting the doomed IndexTuple, >> since it's guaranteed to have its heap tuple super-deleted immediately >> afterwards. > > I am no committer, just a guy giving an opinion about this patch and I > have to admit that I have not enough studied the speculative insertion > code to have a clear technical point of view on the matter, but even > if the chances of destabilizing the code are slim, it does not seem a > wise idea to me to do that post-rc1 and before a GA as there are > people testing the code as it is now. You can express doubt about almost anything. In this case, you haven't voiced any particular concern about any particular risk. The risk of not proceeding is that 9.5 will remain in a divergent state for no reason, with substantial differences in the documentation of the AM interface, and that has a cost. Why should the risk of destabilizing 9.5 become more palatable when 9.5 has been out for 6 months or a year? You can take it that this will probably be now-or-never. This is mostly just comment changes, and changes to documentation. If it comes down to it, we could leave the existing 9.5 code intact (i.e. still unnecessarily insert the IndexTuple), while commenting that it is unnecessary, and still changing everything else. That would have an unquantifiably tiny risk, certainly smaller than the risk of committing the patch as-is (which, to reiterate, is a risk that I think is very small). FWIW, I tend to doubt that users have tested speculative insertion/ON CONFLICT much this whole time. There were a couple of bug reports, but that's it. > Now per the two points listed in > the first sentence in this paragraph, perhaps this opinion is fine as > moot :) I didn't get much involved in the development of this code > after all. I'm concerned that Heikki's apparent unavailability will become a blocker to this. -- Peter Geoghegan
On Fri, Dec 18, 2015 at 4:31 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Dec 17, 2015 at 11:06 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Now per the two points listed in >> the first sentence in this paragraph, perhaps this opinion is fine as >> moot :) I didn't get much involved in the development of this code >> after all. > > I'm concerned that Heikki's apparent unavailability will become a > blocker to this. Yeah, not only this patch... The second committer with enough background on the matter is Andres. -- Michael
On Thu, Dec 17, 2015 at 2:55 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> In any case, at this point 9.5 is really aimed to be stabilized, so >>> targeting only master is a far saner approach IMO for this patch. >>> Pushing that in 9.5 a couple of months back may have given enough >>> reason to do so... But well life is life. >> >> No, this really isn't an optimization at all. > > I should add: I think that the chances of this patch destabilizing the > code are very slim, once it receives the proper review. Certainly, I > foresee no possible downside to not inserting the doomed IndexTuple, > since it's guaranteed to have its heap tuple super-deleted immediately > afterwards. > > That's the only real behavioral change proposed here. So, I would > prefer it if we got this in before the first stable release of 9.5. No, it's far too late to be pushing this into 9.5. We are at RC1 now and hoping to cut a final release right after Christmas. I think it's quite wrong to argue that these changes have no risk of destabilizing 9.5. Nobody is exempt from having bugs in their code - not me, not you, not Tom Lane. But quite apart from that, there seems to be no compelling benefit to having these changes in 9.5. You say that the branches will diverge needlessly, but the whole point of having branches is that we do need things to diverge. The question isn't "why shouldn't these go into 9.5?" but "do these fix something that is clearly broken in 9.5 and must be fixed to avoid hurting users?". Andres has said clearly that he doesn't think so, and Heikki didn't seem convinced that we wanted the changes at all. I've read over the thread and I think that even if all the good things you say about this patch are 100% true, it doesn't amount to a good reason to back-patch. Code that does something possibly non-sensical or sub-optimal isn't a reason to back-patch in the absence of a clear, user-visible consequence. I think it's a shame that we haven't gotten this patch dealt with just because when somebody submits a patch in June, it's not very nice for it to still be pending in December, but since this stuff is even further outside my area of expertise than the sorting stuff, and since me and my split personalities only have so many hours in the day, I'm going to have to leave it to somebody else to pick up anyhow. But that's a separate issue from whether this should be back-patched. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Dec 18, 2015 at 8:58 AM, Robert Haas <robertmhaas@gmail.com> wrote: > No, it's far too late to be pushing this into 9.5. We are at RC1 now > and hoping to cut a final release right after Christmas. I think it's > quite wrong to argue that these changes have no risk of destabilizing > 9.5. Nobody is exempt from having bugs in their code - not me, not > you, not Tom Lane. But quite apart from that, there seems to be no > compelling benefit to having these changes in 9.5. You say that the > branches will diverge needlessly, but the whole point of having > branches is that we do need things to diverge. The question isn't > "why shouldn't these go into 9.5?" but "do these fix something that is > clearly broken in 9.5 and must be fixed to avoid hurting users?". > Andres has said clearly that he doesn't think so, and Heikki didn't > seem convinced that we wanted the changes at all. It isn't true that Heikki was not basically in favor of this. This should have been committed as part of the original patch, really. I hope to avoid needless confusion about the documented (by the official documentation) AM interface. Yes, that is > I think it's a shame that we haven't gotten this patch dealt with just > because when somebody submits a patch in June, it's not very nice for > it to still be pending in December, but since this stuff is even > further outside my area of expertise than the sorting stuff, and since > me and my split personalities only have so many hours in the day, I'm > going to have to leave it to somebody else to pick up anyhow. But > that's a separate issue from whether this should be back-patched. Note that I've already proposed a compromise, even though I don't think my original position was at all unreasonable. There'd be zero real changes (only the addition of the new constant name, documentation updates, comment updates, etc) under that compromise (as against one change.). -- Peter Geoghegan
On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan <pg@heroku.com> wrote: > It isn't true that Heikki was not basically in favor of this. This > should have been committed as part of the original patch, really. Maybe he wasn't against the whole thing, but he's posted two messages to this thread and they can't be read as unequivocally in favor of these changes. He clearly didn't like at least some of it. > I hope to avoid needless confusion about the documented (by the > official documentation) AM interface. Yes, that is Something maybe got cut off here? >> I think it's a shame that we haven't gotten this patch dealt with just >> because when somebody submits a patch in June, it's not very nice for >> it to still be pending in December, but since this stuff is even >> further outside my area of expertise than the sorting stuff, and since >> me and my split personalities only have so many hours in the day, I'm >> going to have to leave it to somebody else to pick up anyhow. But >> that's a separate issue from whether this should be back-patched. > > Note that I've already proposed a compromise, even though I don't > think my original position was at all unreasonable. There'd be zero > real changes (only the addition of the new constant name, > documentation updates, comment updates, etc) under that compromise (as > against one change.). I only see one patch version on the thread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Dec 18, 2015 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan <pg@heroku.com> wrote: >> It isn't true that Heikki was not basically in favor of this. This >> should have been committed as part of the original patch, really. > > Maybe he wasn't against the whole thing, but he's posted two messages > to this thread and they can't be read as unequivocally in favor of > these changes. He clearly didn't like at least some of it. The issues were very trivial. > I only see one patch version on the thread. I'm not going to post a revision until I thrash out the tiny issues with Heikki. He kind of trailed off. So maybe that kills it immediately, which is a shame. -- Peter Geoghegan
On Sat, Dec 19, 2015 at 1:58 AM, Robert Haas wrote: > No, it's far too late to be pushing this into 9.5. We are at RC1 now > and hoping to cut a final release right after Christmas. I think it's > quite wrong to argue that these changes have no risk of destabilizing > 9.5. Nobody is exempt from having bugs in their code - not me, not > you, not Tom Lane. But quite apart from that, there seems to be no > compelling benefit to having these changes in 9.5. You say that the > branches will diverge needlessly, but the whole point of having > branches is that we do need things to diverge. The question isn't > "why shouldn't these go into 9.5?" but "do these fix something that is > clearly broken in 9.5 and must be fixed to avoid hurting users?". > Andres has said clearly that he doesn't think so, and Heikki didn't > seem convinced that we wanted the changes at all. I've read over the > thread and I think that even if all the good things you say about this > patch are 100% true, it doesn't amount to a good reason to back-patch. > Code that does something possibly non-sensical or sub-optimal isn't a > reason to back-patch in the absence of a clear, user-visible > consequence. +1. There are folks around doing tests using 9.5 now, it is not correct to impact the effort they have been putting on it until now. > I think it's a shame that we haven't gotten this patch dealt with just > because when somebody submits a patch in June, it's not very nice for > it to still be pending in December, but since this stuff is even > further outside my area of expertise than the sorting stuff, and since > me and my split personalities only have so many hours in the day, I'm > going to have to leave it to somebody else to pick up anyhow. But > that's a separate issue from whether this should be back-patched. Many patches wait in the queue for months, that's not new. Some of them wait even longer than that. For those reasons, and because you are willing visibly to still work on it, I am moving this patch to next CF. Regards, -- Michael
On Sat, Dec 19, 2015 at 3:26 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > +1. There are folks around doing tests using 9.5 now, it is not > correct to impact the effort they have been putting on it until now. This is a total misrepresentation of what I've said. -- Peter Geoghegan
Peter Geoghegan wrote: > On Fri, Dec 18, 2015 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > I only see one patch version on the thread. > > I'm not going to post a revision until I thrash out the tiny issues > with Heikki. He kind of trailed off. So maybe that kills it > immediately, which is a shame. If you refuse to post an updated version of the patch until Heikki weighs in some more, and given that Heikki has (for the purposes of this patch) completely vanished, I think we should mark this rejected. If somebody else is open to reviewing the patch, I think that'd be another way to move forward, but presumably they would start from a version with the discussed changes already fixed. Otherwise it's a waste of time. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 18, 2016 at 8:36 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > If you refuse to post an updated version of the patch until Heikki > weighs in some more, and given that Heikki has (for the purposes of this > patch) completely vanished, I think we should mark this rejected. I don't refuse. I just don't want to waste anyone's time. I will follow all of Heikki's feedback immediately, except this: "I think it'd be better to define it as "like CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on conflict". The difference is that the aminsert would not be allowed to return FALSE when there is no conflict". That's because I believe this is quite broken, as already pointed out. > If somebody else is open to reviewing the patch, I think that'd be > another way to move forward, but presumably they would start from a > version with the discussed changes already fixed. Otherwise it's a > waste of time. Your premise here is that what Heikki said in passing months ago is incontrovertibly the right approach. That's ridiculous. I think Heikki and I could work this out quite quickly, if he engaged, but for whatever reason he appears unable to. I doubt that Heikki thinks that about what he said, so why do you? The point about CHECK_UNIQUE_YES I highlighted above felt like a temporary misunderstanding to me, and not even what you might call a real disagreement. It wasn't as if Heikki was insistent at the time. I pointed out that what he said was broken according to an established definition of broken (it would result in unprincipled deadlocks). He didn't respond to that point. I think he didn't get back quickly in part because I gave him something to think about. If any other committer wants to engage with me on this, then I will of course work with them. But that will not be predicated on my first revising the patch in a way that this other committer does not understand. That would be profoundly unfair. -- Peter Geoghegan
Peter Geoghegan wrote: > On Mon, Jan 18, 2016 at 8:36 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > If you refuse to post an updated version of the patch until Heikki > > weighs in some more, and given that Heikki has (for the purposes of this > > patch) completely vanished, I think we should mark this rejected. > > I don't refuse. I just don't want to waste anyone's time. I will > follow all of Heikki's feedback immediately, except this: > > "I think it'd be better to define it as "like CHECK_UNIQUE_YES, but > return FALSE instead of throwing an error on conflict". The difference > is that the aminsert would not be allowed to return FALSE when there > is no conflict". > > That's because I believe this is quite broken, as already pointed out. I think I like your approach better. > > If somebody else is open to reviewing the patch, I think that'd be > > another way to move forward, but presumably they would start from a > > version with the discussed changes already fixed. Otherwise it's a > > waste of time. > > Your premise here is that what Heikki said in passing months ago is > incontrovertibly the right approach. That's ridiculous. I think Heikki > and I could work this out quite quickly, if he engaged, but for > whatever reason he appears unable to. I doubt that Heikki thinks that > about what he said, so why do you? I don't -- I just think you could have sent a patch that addressed all the other points, leave this one as initially submitted, and note that the new submission left it unaddressed because you disagreed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jan 18, 2016 at 11:56 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> That's because I believe this is quite broken, as already pointed out. > > I think I like your approach better. That makes things far simpler, then. >> Your premise here is that what Heikki said in passing months ago is >> incontrovertibly the right approach. That's ridiculous. I think Heikki >> and I could work this out quite quickly, if he engaged, but for >> whatever reason he appears unable to. I doubt that Heikki thinks that >> about what he said, so why do you? > > I don't -- I just think you could have sent a patch that addressed all > the other points, leave this one as initially submitted, and note that > the new submission left it unaddressed because you disagreed. I'll try to do that soon. I've got a very busy schedule over the next couple of weeks, though. -- Peter Geoghegan
On Mon, Jan 18, 2016 at 3:14 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Jan 18, 2016 at 11:56 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >>> That's because I believe this is quite broken, as already pointed out. >> >> I think I like your approach better. > > That makes things far simpler, then. > >>> Your premise here is that what Heikki said in passing months ago is >>> incontrovertibly the right approach. That's ridiculous. I think Heikki >>> and I could work this out quite quickly, if he engaged, but for >>> whatever reason he appears unable to. I doubt that Heikki thinks that >>> about what he said, so why do you? >> >> I don't -- I just think you could have sent a patch that addressed all >> the other points, leave this one as initially submitted, and note that >> the new submission left it unaddressed because you disagreed. > > I'll try to do that soon. I've got a very busy schedule over the next > couple of weeks, though. This patch was reviewed during CF 2016-01 and has not been updated for CF 2016-03. I think we should mark it Returned with Feedback. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas <robertmhaas@gmail.com> wrote: > This patch was reviewed during CF 2016-01 and has not been updated for > CF 2016-03. I think we should mark it Returned with Feedback. I have a full plate at the moment, Robert, both as a reviewer and as a patch author. This patch is basically uncontroversial, and is built to make the AM interface clearer, and the design of speculative insertion easier to understand. It's clear we should have it. I'll get around to revising it before too long. -- Peter Geoghegan
On Sat, Mar 12, 2016 at 11:24 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> This patch was reviewed during CF 2016-01 and has not been updated for >> CF 2016-03. I think we should mark it Returned with Feedback. > > I have a full plate at the moment, Robert, both as a reviewer and as a > patch author. This patch is basically uncontroversial, and is built to > make the AM interface clearer, and the design of speculative insertion > easier to understand. It's clear we should have it. I'll get around to > revising it before too long. Only one version of this patch has been sent at the beginning of this thread, and Heikki has clearly expressed his disagreement about at least a portion of it at the beginning of this thread, so I find it hard to define it as an "uncontroversial" thing and something that is clear to have as things stand. Seeing a new version soon would be a good next step I guess. -- Michael
On Sat, Mar 12, 2016 at 2:43 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Only one version of this patch has been sent at the beginning of this > thread, and Heikki has clearly expressed his disagreement about at > least a portion of it at the beginning of this thread, so I find it > hard to define it as an "uncontroversial" thing and something that is > clear to have as things stand. Seeing a new version soon would be a > good next step I guess. What is the point in saying this, Michael? What purpose does it serve? I said "basically uncontroversial", not "uncontroversial". That is a perfectly accurate characterization of the patch, and if you disagree than I suggest you re-read the thread. Andres and Heikki were both in favor of this patch. Heikki and I discussed one particular aspect of it, and then it trailed off. The only thing that Heikki categorically stated was that he disliked one narrow aspect of the style of one thing in one function. I've already said I'm happy to do that. As things stand, the documentation for amcanunique methods, and the way they are described internally, is fairly misleading. -- Peter Geoghegan
On Sat, Mar 12, 2016 at 2:53 PM, Peter Geoghegan <pg@heroku.com> wrote: > I said "basically uncontroversial", not "uncontroversial". That is a > perfectly accurate characterization of the patch, and if you disagree > than I suggest you re-read the thread. In particular, note that Alvaro eventually sided with me against the thing that Heikki argued for: http://www.postgresql.org/message-id/20160118195643.GA117199@alvherre.pgsql Describing what happened that way is unfair on Heikki, because I don't think he was at all firm in what he said about making the new UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on conflict". We were working through the design, and it didn't actually come to any kind of impasse. It's surprising and disappointing to me that this supposed disagreement has been blown out of all proportion. -- Peter Geoghegan
On Sat, Mar 12, 2016 at 5:53 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Sat, Mar 12, 2016 at 2:43 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> Only one version of this patch has been sent at the beginning of this >> thread, and Heikki has clearly expressed his disagreement about at >> least a portion of it at the beginning of this thread, so I find it >> hard to define it as an "uncontroversial" thing and something that is >> clear to have as things stand. Seeing a new version soon would be a >> good next step I guess. > > What is the point in saying this, Michael? What purpose does it serve? Gee, I think Michael is right on target. What purpose does writing him an email that sounds annoyed serve? There hasn't been a new version of this patch in 9 months, you're clearly not in a hurry to produce one, and nobody else seems to feel strongly that this is something that needs to be done at all. I think we could just let this go and be just fine, but instead of doing that and moving onto patches that people do feel strongly about, we're arguing about this. Bummer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas <robertmhaas@gmail.com> wrote: > There hasn't been a new version of this patch in 9 months, you're > clearly not in a hurry to produce one, and nobody else seems to feel > strongly that this is something that needs to be done at all. I think > we could just let this go and be just fine, but instead of doing that > and moving onto patches that people do feel strongly about, we're > arguing about this. Bummer. I'm busy working on fixing an OpenSSL bug affecting all released versions right at the moment, but have a number of complex 9.6 patches to review, most of which are in need of support. I'm very busy. I said that I'd get to this patch soon. I might be kicking the can down the road a little with this patch; if so, I'm sorry. I suggest we leave it at that, until the CF is almost over or until I produce a revision. -- Peter Geoghegan
On 2016-03-14 17:17:02 -0700, Peter Geoghegan wrote: > On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > There hasn't been a new version of this patch in 9 months, you're > > clearly not in a hurry to produce one, and nobody else seems to feel > > strongly that this is something that needs to be done at all. I think > > we could just let this go and be just fine, but instead of doing that > > and moving onto patches that people do feel strongly about, we're > > arguing about this. Bummer. > > I'm busy working on fixing an OpenSSL bug affecting all released > versions right at the moment, but have a number of complex 9.6 patches > to review, most of which are in need of support. I'm very busy. So? You're not the only one. I don't see why we shouldn't move this to 'returned with feedback' until there's a new version. Andres
On Mon, Mar 14, 2016 at 5:21 PM, Andres Freund <andres@anarazel.de> wrote: > So? You're not the only one. I don't see why we shouldn't move this to > 'returned with feedback' until there's a new version. I don't see any point in that; I intend to get a revision in to the ongoing CF. But fine. -- Peter Geoghegan
On Mon, Mar 14, 2016 at 8:17 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> There hasn't been a new version of this patch in 9 months, you're >> clearly not in a hurry to produce one, and nobody else seems to feel >> strongly that this is something that needs to be done at all. I think >> we could just let this go and be just fine, but instead of doing that >> and moving onto patches that people do feel strongly about, we're >> arguing about this. Bummer. > > I'm busy working on fixing an OpenSSL bug affecting all released > versions right at the moment, but have a number of complex 9.6 patches > to review, most of which are in need of support. I'm very busy. > > I said that I'd get to this patch soon. I might be kicking the can > down the road a little with this patch; if so, I'm sorry. I suggest we > leave it at that, until the CF is almost over or until I produce a > revision. Sure, and if everybody does that, then there will be 40 patches that get updated in the last 2 days if the CommitFest, and that will be impossible. Come on. You're demanding a degree of preferential treatment which is unsupportable. You're also assuming that anybody's going to be willing to commit that revision when you produce it, which looks to me like an unproven hypothesis. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 16, 2016 at 11:25 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Sure, and if everybody does that, then there will be 40 patches that > get updated in the last 2 days if the CommitFest, and that will be > impossible. Come on. You're demanding a degree of preferential > treatment which is unsupportable. It's unexpected that an entirely maintenance-orientated patch like this would be received this way. I'm not demanding anything, or applying any real pressure. Let's just get on with it. I attach a revision, that makes all the changes that Heikki suggested, except one. As already noted several times, following this suggestion would have added a bug. Alvaro preferred my original approach here in any case. I refer to my original approach of making the new UNIQUE_CHECK_SPECULATIVE case minimally different from the existing UNIQUE_CHECK_PARTIAL case currently used for deferred unique constraints and speculative insertion, as opposed to making the new UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on conflict". That was broken because CHECK_UNIQUE_YES waits for the outcome of an xact, which UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must never do. Any and all waits happen in the first phase of speculative insertion, and never the seconds. I could give a complicated explanation for why, involving a deadlock scenario, but a simple explanation will do: it has always worked that way, and was tested to work that way. Feedback from Heikki led to these changes for this revision: * The use of arguments within ExecInsert() was simplified. * More concise AM documentation. The docs essentially describe two new concepts: - What unique index insertion needs to know about speculative insertion in general. This doesn't just apply to speculative inserters themselves, of course. - What speculative insertion is. Why it exists (why we don't just wait on xact). In other words, what "unprincipled deadlocks" are, and how they are avoided (from a relatively high level). I feel like I have a responsibility to make sure that this mechanism is well documented, especially given that not that many people were involved in its design. It's possible that no more than the 3 original authors of UPSERT fully understand speculative insertion -- it's easy to miss some of the subtleties. I do not pursue something like this without good reason. I'm optimistic that the patch will be accepted if it is carefully considered. -- Peter Geoghegan
Attachment
On Wed, Mar 16, 2016 at 7:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Mar 16, 2016 at 11:25 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Sure, and if everybody does that, then there will be 40 patches that >> get updated in the last 2 days if the CommitFest, and that will be >> impossible. Come on. You're demanding a degree of preferential >> treatment which is unsupportable. > > It's unexpected that an entirely maintenance-orientated patch like > this would be received this way. I'm not demanding anything, or > applying any real pressure. Let's just get on with it. > > I attach a revision, that makes all the changes that Heikki suggested, > except one. As already noted several times, following this suggestion > would have added a bug. Alvaro preferred my original approach here in > any case. I refer to my original approach of making the new > UNIQUE_CHECK_SPECULATIVE case minimally different from the existing > UNIQUE_CHECK_PARTIAL case currently used for deferred unique > constraints and speculative insertion, as opposed to making the new > UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE > instead of throwing an error on conflict". That was broken because > CHECK_UNIQUE_YES waits for the outcome of an xact, which > UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must > never do. > > Any and all waits happen in the first phase of speculative insertion, > and never the seconds. I could give a complicated explanation for why, > involving a deadlock scenario, but a simple explanation will do: it > has always worked that way, and was tested to work that way. > > Feedback from Heikki led to these changes for this revision: > > * The use of arguments within ExecInsert() was simplified. > > * More concise AM documentation. > > The docs essentially describe two new concepts: > > - What unique index insertion needs to know about speculative > insertion in general. This doesn't just apply to speculative inserters > themselves, of course. > > - What speculative insertion is. Why it exists (why we don't just wait > on xact). In other words, what "unprincipled deadlocks" are, and how > they are avoided (from a relatively high level). > > I feel like I have a responsibility to make sure that this mechanism > is well documented, especially given that not that many people were > involved in its design. It's possible that no more than the 3 original > authors of UPSERT fully understand speculative insertion -- it's easy > to miss some of the subtleties. > > I do not pursue something like this without good reason. I'm > optimistic that the patch will be accepted if it is carefully > considered. This patch has attracted no reviewers. Any volunteers? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Apr 5, 2016 at 9:47 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I do not pursue something like this without good reason. I'm >> optimistic that the patch will be accepted if it is carefully >> considered. > > This patch has attracted no reviewers. Any volunteers? Since nobody has emerged to review this, I have moved it to the next CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 03/17/2016 01:43 AM, Peter Geoghegan wrote: > I attach a revision, that makes all the changes that Heikki suggested, > except one. As already noted several times, following this suggestion > would have added a bug. Alvaro preferred my original approach here in > any case. I refer to my original approach of making the new > UNIQUE_CHECK_SPECULATIVE case minimally different from the existing > UNIQUE_CHECK_PARTIAL case currently used for deferred unique > constraints and speculative insertion, as opposed to making the new > UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE > instead of throwing an error on conflict". That was broken because > CHECK_UNIQUE_YES waits for the outcome of an xact, which > UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must > never do. > > Any and all waits happen in the first phase of speculative insertion, > and never the seconds. I could give a complicated explanation for why, > involving a deadlock scenario, but a simple explanation will do: it > has always worked that way, and was tested to work that way. > > Feedback from Heikki led to these changes for this revision: > > * The use of arguments within ExecInsert() was simplified. > > * More concise AM documentation. > > The docs essentially describe two new concepts: > > - What unique index insertion needs to know about speculative > insertion in general. This doesn't just apply to speculative inserters > themselves, of course. > > - What speculative insertion is. Why it exists (why we don't just wait > on xact). In other words, what "unprincipled deadlocks" are, and how > they are avoided (from a relatively high level). > > I feel like I have a responsibility to make sure that this mechanism > is well documented, especially given that not that many people were > involved in its design. It's possible that no more than the 3 original > authors of UPSERT fully understand speculative insertion -- it's easy > to miss some of the subtleties. Thanks for working on this, and sorry for disappearing and dropping this on the floor earlier. This doesn't apply anymore, thanks to 9c810a2e. Shouldn't be hard to fix. I was in support of this earlier in general, even though I had some questions on the details. But now that I look at the patch again, I'm not so sure anymore. I don't think this actually makes things clearer. It adds new cases that the code needs to deal with. Index AM writers now have to care about the difference between a UNIQUE_CHECK_PARTIAL and UNIQUE_CHECK_SPECULATIVE. You can have the exact same implementation for both, but at the very least the index AM author needs to read the paragraph you added to the docs to understand the difference. That adds some cognitive load. Likewise, in ExecInsertIndexTuples(), this makes the deferred-index case work slightly differently from speculative insertion. It's not a big difference, but it again forces you to keep one more scenario in mind, when reading the code. So overall, I think we should just drop this. Maybe a comment somewhere would be in order, to point out that ExecInsertIndexTuples() and index_insert might perform some unnecessary work, by inserting index tuples for a doomed heap tuple, if a speculative insertion fails. But that's all. - Heikki
On Tue, Sep 20, 2016 at 8:55 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Thanks for working on this, and sorry for disappearing and dropping this on > the floor earlier. This doesn't apply anymore, thanks to 9c810a2e. Shouldn't > be hard to fix. Thanks for looking at it again. > I was in support of this earlier in general, even though I had some > questions on the details. But now that I look at the patch again, I'm not so > sure anymore. Honestly, I almost withdrew this patch myself, simply because it has dragged on so long. It has become ridiculous. > I don't think this actually makes things clearer. It adds new > cases that the code needs to deal with. Index AM writers now have to care > about the difference between a UNIQUE_CHECK_PARTIAL and > UNIQUE_CHECK_SPECULATIVE. You can have the exact same implementation for > both, but at the very least the index AM author needs to read the paragraph > you added to the docs to understand the difference. That adds some cognitive > load. I think it gives us the opportunity to discuss the differences, and in particular to explain why this "speculative insertion" thing exists at all. Besides, we can imagine an amcanunique implementation in which the difference might matter. Honestly, since it is highly unlikely that there ever will be another amcanunique access method, the cognitive burden to implementers of new amcanunique AMs is not a concern to me. Rather, the concern with that part of the patch is educating people about how the whole speculative insertion thing works in general, and drawing specific attention to it in a few key places in the code. Speculative insertion is subtle and complicated enough that I feel that that should be well described, as I've said many times. Remember how hard it was for us to come to agreement on the basic requirements in the first place! > Likewise, in ExecInsertIndexTuples(), this makes the deferred-index > case work slightly differently from speculative insertion. It's not a big > difference, but it again forces you to keep one more scenario in mind, when > reading the code This actually does have useful practical consequences, although I didn't point that out earlier (though I should have). To see what I mean, consider the complaint in this recent thread, which is based on an actual user application problem: https://www.postgresql.org/message-id/flat/1472134358.649524557@f146.i.mail.ru#1472134358.649524557@f146.i.mail.ru I go on to explain how this patch represents a partial solution to that [1]. That's what I mean by "useful practical consequences". As I say in [1], I think we could even get a full solution, if we applied this patch and *also* made the ordering in which the relcache returns a list of index OIDs more useful (it should still be based on something stable, to avoid deadlocks, but more than just OID order. Instead, relcache.c can sort indexes such that we insert into primary keys first, then unique indexes, then all other indexes. This also avoids bloat if there is a unique violation, by getting unique indexes out of the way first during ExecInsert()). > So overall, I think we should just drop this. Maybe a comment somewhere > would be in order, to point out that ExecInsertIndexTuples() and > index_insert might perform some unnecessary work, by inserting index tuples > for a doomed heap tuple, if a speculative insertion fails. But that's all. Perhaps. I'm curious about what your thoughts are on what I've said about "useful practical consequences" of finishing insertion earlier for speculative inserters. If you're still not convinced about this patch, having considered that as well, then I will drop the patch (or maybe we just add some comments, as you suggest). [1] https://www.postgresql.org/message-id/CAM3SWZTFTbK_Y%3D7uWJaXYLHnYQ99pV4KFpmjTKbNJR5_%2BQThzA%40mail.gmail.com -- Peter Geoghegan
On 09/24/2016 02:34 PM, Peter Geoghegan wrote: > On Tue, Sep 20, 2016 at 8:55 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Likewise, in ExecInsertIndexTuples(), this makes the deferred-index >> case work slightly differently from speculative insertion. It's not a big >> difference, but it again forces you to keep one more scenario in mind, when >> reading the code > > This actually does have useful practical consequences, although I > didn't point that out earlier (though I should have). To see what I > mean, consider the complaint in this recent thread, which is based on > an actual user application problem: > > https://www.postgresql.org/message-id/flat/1472134358.649524557@f146.i.mail.ru#1472134358.649524557@f146.i.mail.ru > > I go on to explain how this patch represents a partial solution to > that [1]. That's what I mean by "useful practical consequences". As I > say in [1], I think we could even get a full solution, if we applied > this patch and *also* made the ordering in which the relcache returns > a list of index OIDs more useful (it should still be based on > something stable, to avoid deadlocks, but more than just OID order. > Instead, relcache.c can sort indexes such that we insert into primary > keys first, then unique indexes, then all other indexes. This also > avoids bloat if there is a unique violation, by getting unique indexes > out of the way first during ExecInsert()). Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically to fix that. I'm not convinced that we need all the complications of this patch, to get that fixed. (In particular, indexam's still wouldn't need to care about the different between CHECK_UNIQUE_PARTIAL and CHECK_UNIQUE_SPECULATIVE, I think.) - Heikki
On Tue, Sep 27, 2016 at 9:10 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically to > fix that. I'm not convinced that we need all the complications of this > patch, to get that fixed. (In particular, indexam's still wouldn't need to > care about the different between CHECK_UNIQUE_PARTIAL and > CHECK_UNIQUE_SPECULATIVE, I think.) But you are convinced that everything else I describe would do it? -- Peter Geoghegan
On 09/27/2016 11:22 AM, Peter Geoghegan wrote: > On Tue, Sep 27, 2016 at 9:10 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically to >> fix that. I'm not convinced that we need all the complications of this >> patch, to get that fixed. (In particular, indexam's still wouldn't need to >> care about the different between CHECK_UNIQUE_PARTIAL and >> CHECK_UNIQUE_SPECULATIVE, I think.) > > But you are convinced that everything else I describe would do it? I didn't think very hard about it, but yeah, think so.. - Heikki
On Tue, Sep 27, 2016 at 9:25 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I didn't think very hard about it, but yeah, think so.. Do you think it's worth a backpatch? Or, too early to tell? -- Peter Geoghegan
On 09/27/2016 11:38 AM, Peter Geoghegan wrote: > On Tue, Sep 27, 2016 at 9:25 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> I didn't think very hard about it, but yeah, think so.. > > Do you think it's worth a backpatch? Or, too early to tell? Too early to tell.. - Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes: > On 09/24/2016 02:34 PM, Peter Geoghegan wrote: >> I go on to explain how this patch represents a partial solution to >> that [1]. That's what I mean by "useful practical consequences". As I >> say in [1], I think we could even get a full solution, if we applied >> this patch and *also* made the ordering in which the relcache returns >> a list of index OIDs more useful (it should still be based on >> something stable, to avoid deadlocks, but more than just OID order. >> Instead, relcache.c can sort indexes such that we insert into primary >> keys first, then unique indexes, then all other indexes. This also >> avoids bloat if there is a unique violation, by getting unique indexes >> out of the way first during ExecInsert()). > Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically > to fix that. I'm not convinced that we need all the complications of > this patch, to get that fixed. (In particular, indexam's still wouldn't > need to care about the different between CHECK_UNIQUE_PARTIAL and > CHECK_UNIQUE_SPECULATIVE, I think.) I can see the value of processing unique indexes before non-unique ones. I'm pretty suspicious of trying to prioritize primary keys first, though, because (a) it's not clear why bother, and (b) PG is a tad squishy about whether an index is a primary key or not, so that I'd be worried about different backends sometimes choosing different orders. I'd simplify this to "uniques in OID order then non-uniques in OID order". regards, tom lane
On Tue, Sep 27, 2016 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I can see the value of processing unique indexes before non-unique ones. > I'm pretty suspicious of trying to prioritize primary keys first, though, > because (a) it's not clear why bother, and (b) PG is a tad squishy about > whether an index is a primary key or not, so that I'd be worried about > different backends sometimes choosing different orders. I'd simplify > this to "uniques in OID order then non-uniques in OID order". I see your point. A more considered ordering of indexes for processing by the executor (prepared for it by the relcache), including something more that goes further than your proposal is useful in the context of fixing the bug I mentioned [1], but for non-obvious reasons. I would like to clarify what I meant there specifically. I am repeating myself, but maybe I just wasn't clear before. The theory of putting the PK first there is that we then have a well-defined (uh, better defined) user-visible ordering *across unique indexes* such that the problem case would *reliably* be fixed. With only this refactoring patch applied (and no change to the relcache ordering thing), it is then only a sheer accident of OID assignment ordering that the INSERT ON CONFLICT problem case happens to take the alternative path on the OP's *inferred* index (which, as it happens, was the PK for him), rather than the other unique index that was involved (the one that is not actually inferred, and yet is equivalent to the PK, UPSERT-semantics-wise). So, the reporter of the bug [1] is happy with his exact case now working, at least. You might now counter: "But why prefer one convention over the other? Prioritizing the PK would reliably fix that particular problem case, but that's still pretty arbitrary." It's true that it's somewhat arbitrary to always (speculatively) insert into the PK first. But, I think that it's more likely that the PK is inferred in general, and so it's more likely that users will fall on the right side of that in practice. Besides, at least we now have a consistent behavior. You might also reasonably ask: "But what if there are multiple unique indexes, none of which happen to be the PK? Isn't that subject to the same vagaries of OID ordering anyway?" I must admit that it is. But I don't really know where to draw the line here. Is it worth contemplating a more complicated scheme still? For example, trigger-style ordering; a sort order that considers index name as a "secondary attribute", in order to ensure perfectly consistent behavior? I must admit that I don't really have a clue whether or not that's a good idea. It's an idea. [1] https://www.postgresql.org/message-id/CAM3SWZTFTbK_Y%3D7uWJaXYLHnYQ99pV4KFpmjTKbNJR5_%2BQThzA%40mail.gmail.com -- Peter Geoghegan