Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} |
Date | |
Msg-id | CAM3SWZT-PVVB437Jmq0Y=UUhiMiZDFpHx-oHHcQVepjDo6Sn7A@mail.gmail.com Whole thread Raw |
In response to | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} |
List | pgsql-hackers |
On Mon, Oct 27, 2014 at 4:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > Let's see if we can link these two thoughts. > > 1. You think the biggest problem is the lack of attention to the design. > > 2. I keep asking you to put the docs in a readable form. > > If you can't understand the link between those two things, I am at a loss. You've read the docs. Please be clearer. In what sense are they not readable? The main description of the feature appears on the INSERT reference page: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html These paragraphs were added (there is more on the INSERT reference page than mentioned here, but this is the main part): """ The optional ON CONFLICT clause specifies a path to take as an alternative to raising a uniqueness violation error. ON CONFLICT IGNORE simply avoids inserting any individual row when it is determined that a uniqueness violation error would otherwise need to be raised. ON CONFLICT UPDATE has the system take an UPDATE path in respect of such rows instead. ON CONFLICT UPDATE guarantees an atomic INSERT or UPDATE outcome. While rows may be updated, the top-level statement is still an INSERT, which is significant for the purposes of statement-level triggers and the rules system. Note that in the event of an ON CONFLICT path being taken, RETURNING returns no value in respect of any not-inserted rows. ON CONFLICT UPDATE optionally accepts a WHERE clause condition. When provided, the statement only proceeds with updating if the condition is satisfied. Otherwise, unlike a conventional UPDATE, the row is still locked for update. Note that the condition is evaluated last, after a conflict has been identified as a candidate to update. ON CONFLICT UPDATE accepts EXCLUDED expressions in both its targetlist and WHERE clause. This allows expressions (in particular, assignments) to reference rows originally proposed for insertion. Note that the effects of all per-row BEFORE INSERT triggers are carried forward. This is particularly useful for multi-insert ON CONFLICT UPDATE statements; when inserting or updating multiple rows, constants need only appear once. There are several restrictions on the ON CONFLICT UPDATE clause that do not apply to UPDATE statements. Subqueries may not appear in either the UPDATE targetlist, nor its WHERE clause (although simple multi-assignment expressions are supported). WHERE CURRENT OF cannot be used. In general, only columns in the target table, and excluded values originally proposed for insertion may be referenced. Operators and functions may be used freely, though. INSERT with an ON CONFLICT UPDATE clause is a deterministic statement. You cannot UPDATE any single row more than once. Writing such an INSERT statement will raise a cardinality violation error. Rows proposed for insertion should not duplicate each other in terms of attributes constrained by the conflict-arbitrating unique index. Note that the ordinary rules for unique indexes with regard to NULL apply analogously to whether or not an arbitrating unique index indicates if the alternative path should be taken, so multiple NULL values across a set of rows proposed for insertion are acceptable; the statement will always insert (assuming there is no unrelated error). Note that merely locking a row (by having it not satisfy the WHERE clause condition) does not count towards whether or not the row has been affected multiple times (and whether or not a cardinality violation error is raised). ON CONFLICT UPDATE requires a column specification, which is used to infer an index to limit pre-checking for duplicates to. ON CONFLICT IGNORE makes this optional. Failure to anticipate and prevent would-be uniqueness violations originating in some other unique index than the single unique index that was anticipated as the sole source of would-be uniqueness violations can result in failing to insert a row (taking the IGNORE path) when a uniqueness violation may have actually been appropriate; omitting the specification indicates a total indifference to where any would-be uniqueness violation could occur. Note that the ON CONFLICT UPDATE assignment may result in a uniqueness violation, just as with a conventional UPDATE. The rules for unique index inference are straightforward. Columns and/or expressions specified must match all the columns/expressions of some existing unique index on table_name. The order of the columns/expressions in the index definition, or whether or not the index definition specified NULLS FIRST or NULLS LAST, or the internal sort order of each column (whether DESC or ASC were specified) are all irrelevant. However, partial unique indexes are not supported as arbiters of whether an alternative ON CONFLICT path should be taken. Deferred unique constraints are also not accepted. You must have INSERT privilege on a table in order to insert into it, as well as UPDATE privilege if and only if ON CONFLICT UPDATE is specified. If a column list is specified, you only need INSERT privilege on the listed columns. Similarly, when ON CONFLICT UPDATE is specified, you only need UPDATE privilege on the column(s) that are listed to be updated, as well as SELECT privilege on any column whose values are read in the ON CONFLICT UPDATE expressions or condition. Use of the RETURNING clause requires SELECT privilege on all columns mentioned in RETURNING. If you use the query clause to insert rows from a query, you of course need to have SELECT privilege on any table or column used in the query. """ On the same page, there is commentary on each new addition to the grammar (e.g. expression_index, column_name_index, expression). There are some worked examples, too (there are now 3 ON CONFLICT related examples out of 12 total): """ Insert or update new distributors as appropriate. Assumes a unique index has been defined that constrains values appearing in the did column. Note that an EXCLUDED expression is used to reference values originally proposed for insertion: INSERT INTO distributors (did, dname) VALUES (5, 'Gizmo transglobal'), (6, 'Doohickey, inc') ON CONFLICT (did) UPDATE SETdname = EXCLUDED(dname) || ' (formerly ' || dname || ')' Insert a distributor, or do nothing for rows proposed for insertion when an existing, excluded row (a row with a matching constrained column or columns after before row insert triggers fire) exists. Assumes a unique index has been defined that constrains values appearing in the did column (although since the IGNORE variant was used, the specification of columns to infer a unique index from is not mandatory): INSERT INTO distributors (did, dname) VALUES (7, 'Doodad GmbH') ON CONFLICT (did) IGNORE Insert or update new distributors as appropriate. Assumes a unique index has been defined that constrains values appearing in the did column. WHERE clause is used to limit the rows actually updated (any existing row not updated will still be locked, though): -- Don't update any existing row if it was already renamed at some -- earlier stage INSERT INTO distributors (did, dname)VALUES (8, 'Thingamabob Distribution') ON CONFLICT (did) UPDATE SET dname = EXCLUDED(dname) || ' (formerly ' || dname|| ')' WHERE dname NOT LIKE '%(formerly %)' """ If you do not provide actionable feedback, then I cannot do anything. That's frustrating for me. Saying that the docs are not in a "readable form" is too vague for me to act on. If I'm not mistaken, they're in the same form (the same medium - sgml doc patches) that they've been for every patch ever submitted to PostgreSQL (in terms of being easy to read against the backdrop of existing docs, which you mentioned before). I have built the html docs and uploaded them myself, as an additional convenience to reviewers. What standard am I failing to meet here? I spent *hours* editing the Wiki yesterday to make the structure clearer (little new content was added), and make the information exactly current. I am bending over backwards to make the user-visible behaviors clearer, and to document the patch well, both from a user-visible perspective, and an internal perspective. I have internal docs for both implementations of value locking (#1 and #2). Then there's the value locking Wiki page. I have produced *reams* of documentation at this point, all in an effort to make things easier to understand. The user visible doc patch diff stats: ---doc/src/sgml/ddl.sgml | 23 +++doc/src/sgml/indices.sgml | 11 +-doc/src/sgml/mvcc.sgml | 43 ++++--doc/src/sgml/plpgsql.sgml | 20 ++-doc/src/sgml/postgres-fdw.sgml | 8 ++doc/src/sgml/ref/create_index.sgml | 7 +-doc/src/sgml/ref/create_rule.sgml | 6 +-doc/src/sgml/ref/create_table.sgml | 5 +-doc/src/sgml/ref/create_trigger.sgml | 5 +-doc/src/sgml/ref/create_view.sgml | 36 ++++-doc/src/sgml/ref/insert.sgml | 258 ++++++++++++++++++++++++++++++++--doc/src/sgml/ref/set_constraints.sgml| 6 +-doc/src/sgml/trigger.sgml | 46++++--13 files changed, 426 insertions(+), 48 deletions(-) #1 internal documentation stats: doc/src/sgml/indexam.sgml | 133 ++++++++++++++++++++++++++++++++++++---src/backend/access/nbtree/README | 90 +++++++++++++++++++++++++-src/backend/executor/README | 35 +++++++++++3 files changed, 247 insertions(+), 11 deletions(-) #2 internal documentation stats: ---src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++1 file changed, 49 insertions(+) Maybe those INSERT reference paragraphs could use another pass of copy-editing, but that's not what you said - you suggested some far more significant issue. Some other issues, like exactly how triggers work with the feature are discussed on dedicated pages rather than the main INSERT reference page, which seems consistent with the existing documentation. Have I not provided a total of 4 isolation tests illustrating interesting concurrency/visibility interactions? That's a lot of isolation tests. Here is the tests commit stat: 31 files changed, 1159 insertions(+), 8 deletions(-) I really don't have any idea what you'd like me to do. Once again: please be more specific. I don't know what you mean. -- Peter Geoghegan
pgsql-hackers by date: