Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Date
Msg-id 20150202010613.GA25543@awork2.anarazel.de
Whole thread Raw
In response to Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
Hi,

A first (not actually that quick :() look through the patches to see
what actually happened in the last months. I didn't keep up with the
thread.

Generally the split into the individual commits doesn't seem to make
much sense to me. The commits individually (except the first) aren't
indivdiually commitable and aren't even meaningful. Splitting off the
internal docs, tests and such actually just seems to make reviewing
harder because you miss context. Splitting it so that individual piece
are committable and reviewable makes sense, but... I have no problem
doing the user docs later. If you split of RLS support, you need to
throw an error before it's implemented.

0001:
* References INSERT with ON CONFLICT UPDATE, can thus not be committed independently. I don't think those references
reallyare needed.
 
* I'm not a fan of the increased code duplication in ExecCheckRTEPerms(). Can you look into cleaning that up?
* Also the comments inside the ACL_INSERT case still reference UPDATE.

Other than that I think we can just go ahead and commit this ahead of
time. Mentioning ON CONFLICT UPDATE (OCU henceforth) in the commit
message only.

0007:
* "AMs" alone isn't particularly unique.
* Without the context of the discussion "unprincipled deadlocks" aren't well defined.
* Too many "" words.
* Waiting "too long" isn't defined. Neither is why that'd imply unprincipled deadlocks. Somewhat understandable with
thecontext of the discussion, but surely not a couple years down the road.
 
* As is I don't find the README entry super helpful. It should state what the reason for doing this is cleary, start at
thehigher level, and then move to the details.
 
* Misses details about the speculative heavyweight locking of tuples.

0002:
* Tentatively I'd say that killspeculative should be done via a separate function instead of heap_delete()
* I think we should, as you ponder in a comment, do the OCU specific stuff lazily and/or in a separate function from
BuildIndexInfo().That function is already quite visible in profiles, and the additional work isn't entirely trivial.
 
* I doubt logical decoding works with the patch as it stands.
* The added ereport (i.e. user facing) error message in ExecInsertIndexTuples won't help a user at all.
* Personally I don't care one iota for comments like "Get information from the result relation info structure.". Yes,
oneof these already exists, but ...
 
* If a arbiter index is passed to ExecCheckIndexConstraints(), can't we abort the loop after checking it? Also, do we
reallyhave to iterate over indexes for that case? How about moving the loop contents to a separate function and using
thatseparately for the arbiter cases?
 
* Don't like the comment above check_exclusion_or_unique_constraint()'s much. Makes too much of a special case of OCU
* ItemPointerIsValid
* ExecCheckHeapTupleVisible's comment "It is not acceptable to proceed " sounds like you're talking with a child or so
;)
* ExecCheckHeapTupleVisible()'s errhint() sounds like an argument/excuse (actually like a code comment). That's not
goingto help a user at all.
 
* I find the modified control flow in ExecInsert() pretty darn ugly. I think this needs to be cleaned up. The
speculativecase should imo be a separate function or something.
 
* /*  * This may occur when an instantaneously invisible tuple is blamed  * as a conflict because multiple rows are
insertedwith the same  * constrained values.  How can this happen? We don't insert multiple rows with the same  command
id?
* ExecLockUpdatedTuple() has (too?) many comments, but little overview of what/why it is doing what it does on a higher
level.

* plan_speculative_use_index: "Use the planner to decide speculative insertion arbiter index" - Huh? " rel is the
targetto undergo ON CONFLICT UPDATE/IGNORE." - Which rel?
 
* formulations as "fundamental nexus" are hard to understand imo.
* Perhaps it has previously been discussed but I'm not convinced by the reasoning for not looking at opclasses in
infer_unique_index().This seems like it'd prohibit ever having e.g. case insensitive opclasses - something surely
worthwile.
* Doesn't infer_unique_index() have to look for indisvalid? This isn't going to work well with a invalid (not to speak
fora !ready) index.
 
* Is ->relation in the UpdateStmt generated in transformInsertStmt ever used for anything? If so, it'd possibly
generatesome possible nastyness due to repeated name lookups. Looks like it'll be used in transformUpdateStmt
 
* What's the reason for the !pstate->p_parent? Also why the parens?pstate->p_is_speculative = (pstate->parentParseState
&&                           (!pstate->p_parent_cte &&
pstate->parentParseState->p_is_insert&&                             pstate->parentParseState->p_is_speculative));
 
* Why did you need to make %nonassoc   DISTINCT and ON nonassoc in the grammar?
* The whole speculative insert logic isn't really well documented. Why, for example, do we actually need the token? And
whyare there no issues with overflow? And where is it documented that a 0 means there's no token? ...
 
* Isn't "SpecType" a awfully generic (and nondescriptive) name?
* /* XXX:  Make sure that re-use of bits is safe here */ - no, not unless you change existing checks.
*     /* * Immediately VACUUM "super-deleted" tuples */if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
               InvalidTransactionId))    return HEAPTUPLE_DEAD; Is that branch really needed? Shouldn't it just be
happeningas a consequence of the already existing code? Same in SatisfiesMVCC. If you actually needed that block, it'd
needto be done in SatisfiesSelf as well, no? You have a comment about a possible loop - but that seems wrong to me,
implyingthat HEAP_XMIN_COMMITTED was set invalidly.
 


Ok, I can't focus at all any further at this point. But there's enough
comments here that some even might make sense ;)

Greetings,

Andres Freund

--Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Small doc patch about pg_service.conf
Next
From: Michael Paquier
Date:
Subject: Some dead code in metaphone() of fuzzystrmatch.c