Re: Refactoring speculative insertion with unique indexes a little - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Refactoring speculative insertion with unique indexes a little
Date
Msg-id CAM3SWZTOBedtuUJuhm2TKEGEzBRBkhQKJzstbvd8ibfeA23bQQ@mail.gmail.com
Whole thread Raw
In response to Re: Refactoring speculative insertion with unique indexes a little  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Refactoring speculative insertion with unique indexes a little
Re: Refactoring speculative insertion with unique indexes a little
List pgsql-hackers
On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for
> speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like
> CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if
> there's a conflict". I think it'd be better to define it as "like
> CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on
> conflict". The difference is that the aminsert would not be allowed to
> return FALSE when there is no conflict.

Suppose we do it that way. Then what's the difference between
CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just
effectively required the CHECK_UNIQUE_YES case to not physically
insert a physical tuple before throwing an error, which does not seem
essential to the existing definition of CHECK_UNIQUE_YES -- you've
redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the
moment. If we had an amcanunique AM that worked a bit like exclusion
constraints, this new obligation for CHECK_UNIQUE_YES might make it
impossible for that to work.

I'm not necessarily in disagreement. I just didn't do it that way for
that reason.

> Actually, even without this patch, a dummy implementation of aminsert that
> always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal per the
> docs, but would loop forever. So that would be nice to fix or document away,
> even though it's not a problem for B-tree currently.

UNIQUE_CHECK_PARTIAL calls to an aminsert function are allowed to
report false positives (of not being unique, by returning FALSE as you
say), where there may not have been a conflict, but it's not clear
(e.g. because the conflicting xact has yet to commit/abort). You still
need to insert, though, so that the recheck at end-of-xact works for
deferred constraints. At that point, it's really not too different to
ordinary unique enforcement, except that the other guy is guaranteed
to notice you, too.

You can construct a theoretical case where lock starvation occurs with
unique constraint enforcement. I think it helps with nbtree here that
someone will reliably *not* see a conflict when concurrently
inserting, because unique index "value locking" exists (by locking the
first leaf page a value could be on with a buffer lock). But even if
that wasn't the case, the insert + recheck thing would be safe, just
as with exclusion constraints...provided you insert to begin with,
that is.

>> In passing, I have make ExecInsertIndexTuples() give up when a
>> speculative insertion conflict is detected. Again, this is not about
>> bloat prevention; it's about making the code easier to understand by
>> not doing something that is unnecessary, and in avoiding that also
>> avoiding the implication that it is necessary. There are already
>> enough complicated interactions that *are* necessary (e.g. "livelock
>> avoidance" for exclusion constraints). Let us make our intent clearer.
>
> Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour now
> depends on whether specConflict is passed, but that seems weird as
> specConflict is an output parameter.

Your statement is ambiguous and confusing to me. Are you objecting to
the idea of returning from ExecInsertIndexTuples() "early" in general,
or to one narrow aspect of how that was proposed -- the fact that it
occurs due to "specConflict != NULL" rather than "noDupErr"? Obviously
if it's just the latter, then that's a small adjustment. But AFAICT
your remark about specConflict could one detail of a broader complaint
about an idea that you dislike generally.

>> The patch also updates the AM interface documentation (the part
>> covering unique indexes). It seems quite natural to me to document the
>> theory of operation for speculative insertion there.
>
>
> Yeah, although I find the explanation pretty long-winded and difficult to
> understand ;-).

I don't know what you mean. You say things like this to me fairly
regularly, but I'm always left wondering what the take-away should be.
Please be more specific.

Long-winded generally means that more words than necessary were used.
I think that the documentation proposed is actually very information
dense, if anything. As always, I aimed to keep it consistent with the
surrounding documentation.

I understand that the material might be a little hard to follow, but
it concerns how someone can externally implement a Postgres index
access method that is "amcanunique". That's a pretty esoteric subject
-- so far, we've had exactly zero takers (even without the
"amcanunique" part). It's damn hard to make these ideas accessible.

I feel that I have an obligation to share information about (say) how
the speculative insertion mechanism works -- the "theory of operation"
seems very useful. Like any one of us, I might not be around to
provide input on something like that in the future, so it makes sense
to be thorough and unambiguous. There are very few people (3?) who
have a good sense of how it works currently, and there are some very
subtle aspects to it that I tried to point out.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Raising our compiler requirements for 9.6
Next
From: Peter Geoghegan
Date:
Subject: Re: Refactoring speculative insertion with unique indexes a little