Thread: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Attached WIP patch extends the INSERT statement, adding a new ON CONFLICT {UPDATE | IGNORE} clause. This allows INSERT statements to perform UPSERT operations (if you want a more formal definition of UPSERT, I refer you to my pgCon talk's slides [1], or the thread in which I delineated the differences between SQL MERGE and UPSERT [2]). The patch builds on previous work in this area, and incorporates feedback from Kevin and Andres.

Overview
========

Example usage:

INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE SET val = 'update';

Essentially, the implementation has all stages of query processing track some auxiliary UPDATE state. So, for example, during parse analysis, UPDATE transformation occurs in an ad-hoc fashion tightly driven by the parent INSERT, but using the existing infrastructure (i.e. transformStmt()/transformUpdateStmt() is called, and is insulated from having to care about the feature as a special case).

There are some restrictions on what this auxiliary update may do, but FWIW there are considerably fewer than those that the equivalent MySQL or SQLite feature imposes on their users. All of the following SQL queries are valid with the patch applied:

-- Nesting within wCTE:
WITH t AS (
    INSERT INTO z SELECT i, 'insert'
    FROM generate_series(0, 16) i
    ON CONFLICT UPDATE SET v = v || 'update' -- use of operators/functions in targetlist
    RETURNING * -- only projects inserted tuples, never updated tuples
)
SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;

-- IGNORE variant:
INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT IGNORE;

-- Predicate within UPDATE auxiliary statement (row is still locked
-- when the UPDATE predicate isn't satisfied):
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE WHERE val != 'delete';

As with SQL MERGE (at least as implemented in other systems), subqueries may not appear within the UPDATE's targetlist, nor may they appear within the special WHERE clause.
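For example, both of the following would be rejected by the patch (sketches only, against a hypothetical "lookup" table):

-- Disallowed: subquery in the auxiliary UPDATE's targetlist
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE SET val = (SELECT v FROM lookup LIMIT 1);

-- Disallowed: subquery expression in the special WHERE clause
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE SET val = 'update'
WHERE val IN (SELECT v FROM lookup);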
But the "INSERT part" of the query has no additional limitations, so you may for example put subqueries within a VALUES() clause, or INSERT...SELECT...ON CONFLICT UPDATE... just as you'd expect. INSERT has been augmented with a new clause, but that clause does not unreasonably fail to play nice with any other aspect of insertion. (Actually, that isn't quite true, since at least for now table inheritance, updatable views and foreign tables are unsupported. This can be revisited.)

I think that without initially realizing it, I copied the SQLite syntax [3]. However, unlike with that SQLite feature, CONFLICT only refers to a would-be duplicate violation, and not a violation of any other kind of constraint.

How this new approach works (Executor and Optimizer stuff)
==========================================================

During the execution of the parent ModifyTable, a special auxiliary subquery (the UPDATE ModifyTable) is considered as a special case. This is not a subplan of the ModifyTable node in the conventional sense, and so does not appear within EXPLAIN output. However, it is more or less independently planned, and entirely driven by the INSERT ModifyTable. ExecModifyTable() is never called with this special auxiliary plan state passed directly. Rather, its parent manages the process as the need arises.

ExecLockUpdateTuple() locks and potentially updates tuples, using the EvalPlanQual() mechanism (even at higher isolation levels, with appropriate precautions). The per-tuple expression context of the auxiliary query/plan is used with EvalPlanQual() from within ExecLockUpdateTuple() (the new routine tasked with locking and updating on conflict). There is a new ExecUpdate() call site within ExecLockUpdateTuple(). Given the restrictions necessarily imposed on this pseudo-rescanning (principally the outright rejection of anything that necessitates PARAM_EXEC parameters during planning), this is safe, as far as I'm aware.
It is convenient to be able to re-use infrastructure in such a way as to more or less handle the UPDATE independently, driven by the INSERT, except for execution, which is more directly handled by the INSERT (i.e. there is no ExecModifyTable() call in respect of this new auxiliary ModifyTable plan). Granted, it is kind of bizarre that the auxiliary query may have a more complex plan than is necessary for our purposes, but it doesn't actually appear to be a problem when "rescanning" (like a SELECT FOR UPDATE/FOR SHARE node, we call EvalPlanQualSetTuple() directly).

It is likely worthwhile to teach the optimizer that we really don't care about how the one and only base rel within the UPDATE auxiliary subquery (the target table) is scanned, if only to save a few cycles. I have (temporarily) hacked the optimizer to prevent index-only scans, which are problematic here, by adding disable_cost when a query parse tree that uses the feature is seen. Although what I've done is a temporary kludge, the basic idea of forcing a particular type of relation scan has a precedent: UPDATE WHERE CURRENT OF artificially forces a TID scan, because only a TID scan will work correctly there. I couldn't come up with a convenient way to artificially inject disable_cost into alternative scan types, in the less invasive style of isCurrentOf, because there is no convenient qual to target within cost_qual_eval().

As in previous incarnations, we lock each tuple (although, of course, only with the UPDATE variant). We may or may not also actually proceed with the update, depending on whether or not the user-specified special update predicate (if any) is satisfied.
But if we do, EvalPlanQual() is (once the tuple is locked) only ever evaluated on a conclusively committed and locked-by-us conflict tuple as part of the process of updating, even though it's possible for the UPDATE predicate to be satisfied where conceivably it would not be satisfied by the tuple version actually visible to the command's MVCC snapshot.

I think this is the correct behavior. We all seem to be in agreement that we should update at READ COMMITTED if *no* version of the tuple is visible. It seems utterly arbitrary to me to suggest that on the one hand it's okay to introduce one particular "MVCC violation", but not another equivalent one. The first scenario is one in which we update despite our update's (or rather insert's) "predicate" not being satisfied (according to our MVCC snapshot). The second scenario is one in which the same "predicate" is also not satisfied according to our MVCC snapshot, but in a slightly different way. Why bother introducing a complicated distinction, if it's a distinction without a difference? I'd rather have a behavior that is consistent, easy to reason about, and easy to explain. And so, the predicate is considered once, after conclusively locking a conflict tuple.

It feels natural and appropriate to me that if the special UPDATE qual isn't satisfied, we still lock the tuple. After all, in order to make a conclusive determination about the qual not being satisfied, we need to lock the tuple. This happens to insulate ExecUpdate() from having to care about "invisible tuples", which are now possible (although we still throw an error, just with a useful error message that phrases the problem in reference to this new feature).

Of course, at higher isolation levels serialization errors are thrown when something inconsistent with the higher level's guarantees would otherwise need to occur (even for the IGNORE variant).
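To make the ordering concrete, here is a sketch of the intended READ COMMITTED behavior (assuming the upsert table from the earlier examples, where key 1 already holds val = 'delete'):

BEGIN;
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE SET val = 'update' WHERE val != 'delete';
-- The conflicting row is locked first; only then is the special
-- WHERE clause evaluated, against the latest committed tuple
-- version (not necessarily the version visible to our snapshot).
-- Here the qual isn't satisfied, so no update occurs -- but the
-- row remains locked until COMMIT/ABORT, blocking concurrent
-- updaters.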
Still, interactions with SSI, and preserving the guarantees of SSI, should probably be closely considered by a subject matter expert.

Omission
========

The patch currently lacks a way of referencing datums rejected for insertion when updating. The way MySQL handles the issue seems questionable. They allow you to do something like this:

INSERT INTO upsert (key, val) VALUES (1, 'val')
ON DUPLICATE KEY UPDATE val = VALUES(val);

The implication is that the updated value comes from the INSERT's VALUES() list, but emulating that seems like a bad idea. In general, at least with Postgres it's entirely possible that the values rejected differ from the values appearing in the VALUES() list, due to the effects of before triggers. I'm not sure whether or not we should assume equivalent transformations during any UPDATE before triggers. This is an open item. I think it makes sense to deal with it a bit later.

"Value locking"
===============

To date, on-list discussion around UPSERT has almost exclusively concerned what I've called "value locking"; the idea of locking values in unique indexes in the abstract (to establish the right to insert ahead of time). There was some useful discussion on this question between myself and Heikki back around December/January. Ultimately, we were unable to reach agreement on an approach and discussion tapered off. However, Heikki did understand the concerns that informed my design. He recognized the need to be able to easily *release* value locks, so as to avoid "unprincipled deadlocks", where under high concurrency there are deadlocks between sessions that only UPSERT a single row at a time. I'm not sure how widely appreciated this point is, but I believe that Heikki appreciates it. It is a very important point in my opinion. I don't want an implementation that is in any way inferior to the "UPSERT looping subxact" pattern (i.e. the plpgsql thing that the docs suggest).
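For reference, the looping-subxact pattern in question is essentially the docs' merge_db() example, which retries until either the UPDATE or the INSERT succeeds:

CREATE FUNCTION merge_db(k INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = k;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key; if someone else
        -- inserts the same key concurrently we get a unique-key
        -- failure, and aborting the implicit subxact releases
        -- our locks before we loop to try the UPDATE again
        BEGIN
            INSERT INTO db(a, b) VALUES (k, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;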
When we left off, Heikki continued to favor an approach that involved speculatively inserting heap tuples, and then deleting them in the event of a conflict. This design was made more complicated when the need to *release* value locks became apparent (Heikki ended up making some changes to HeapTupleSatisfiesDirty(), as well as sketching a design for what you might call a "super delete", where xmin can be set to InvalidTransactionId for speculatively-inserted heap tuples). After all, it wasn't as if we could abort a subxact to release locks, which is what the "UPSERT looping subxact" pattern does. I think it's fair to say that that design became more complicated than initially anticipated [4] [5].

Anyway, the greater point here is that fundamentally, AFAICT Heikki and I were in agreement. Once you buy into the idea that we must avoid holding on to "value locks" of whatever form - as Heikki evidently did - then exactly what form they take is ultimately only a detail. Granted, it's a very important detail, but a detail nonetheless. It can be discussed entirely independently of all of this new stuff, and thank goodness for that. If anyone finds my (virtually unchanged) page heavyweight lock based value locking approach objectionable, I ask that the criticism be framed in a way that makes a sharp distinction between each of the following:

1. You don't accept that value locks must be easily released in the event of a conflict. Is anyone in this camp? It's far from obvious to me what side of this question Andres is on at this stage, for example. Robert might have something to say here too.

2.
Having taken into account the experience of myself and Heikki, and all that is implied by taking that approach ***while avoiding unprincipled deadlocks***, you continue to believe that an approach based on speculative heap insertion, or some alternative scheme, is better than what I have done to the nbtree code here, or you otherwise dislike something about the proposed value locking scheme. You accept that value locks must be released, and released easily, in the event of a conflict, but like Heikki you just don't like what I've done to get there.

Since we can (I believe) talk about the value locking aspect and the rest of the patch independently, we should do so...unless you're in camp 1, in which case I guess that we'll have to thrash it out.

Syntax, footguns
================

As I mentioned, I have incorporated feedback from Kevin Grittner. You may specify a unique index to merge on from within the INSERT statement, thus avoiding the risk of inadvertently having the update affect the wrong tuple due to the user failing to consider that there was a would-be unique violation within some other unique index constraining some other attribute. You may write the DML statement like this:

INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT WITHIN upsert_pkey UPDATE SET val = 'update';

I think that there is a good chance that at least some people will want to make this mandatory. I guess that's fair enough, but I *really* don't want to *mandate* that users specify the name of their unique index in DML, for obvious reasons. Perhaps we can come up with a more tasteful syntax that covers all interesting cases (consider the issues with partial unique indexes and before triggers, for example, where a conclusion reached about which index to use during parse analysis may subsequently be invalidated by user-defined code, or ambiguous specifications in the face of overlapping attributes between two unique composite indexes, etc).
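To illustrate the kind of ambiguity at issue, consider a hypothetical schema with overlapping unique composite indexes:

CREATE TABLE t(a int, b int, c int);
CREATE UNIQUE INDEX t_a_b ON t(a, b);
CREATE UNIQUE INDEX t_b_c ON t(b, c);
-- A hypothetical column-based specification naming only (b)
-- matches neither index outright; and with a partial unique
-- index, a BEFORE trigger can change the row so that the index
-- chosen during parse analysis no longer applies.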
The Right Thing is far from obvious, and there is very little to garner from other systems, since SQL MERGE promises essentially nothing about concurrency, both as specified by the standard and in practice. You don't need a unique index at all, and as I showed in my pgCon talk, there are race conditions even for a trivial UPSERT operation in all major SQL MERGE implementations.

Note that making merging on one particular unique index mandatory (via syntax) buys the implementation no useful leeway. Just for example, the unprincipled deadlocks test case that illustrated the problem with early "promise tuple" style approaches to value locking [6] involved only a single unique index. AFAICT, the question of whether or not this should be mandatory is just a detail of the feature's high level design, as opposed to something expected to significantly influence the implementation.

Testing, performance
====================

As you'd expect, I've included both isolation tests and regression tests covering a reasonable variety of cases. In addition, stress testing is an important part of my testing strategy. Reviewers are encouraged to try out these test bash scripts:

https://github.com/petergeoghegan/upsert

(Interested hackers should request collaborator status on that Github project from me privately. I welcome new, interesting test cases.)

The performance of the patch seems quite good, and is something that these stress-testing bash scripts also test. Upserts are compared against "equivalent" inserts when we know we'll never update, and against "equivalent" updates when we know we'll never insert. On an 8 core test server, I can sustain ~90,000 ordinary insert transactions per second on an unlogged table defined as follows:

create unlogged table foo
(
    merge serial primary key,
    b int4,
    c text
);

In all cases pgbench uses 8 clients (1 per CPU core). With "equivalent" upserts, it's about ~66,000 TPS.
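The "equivalent upsert" numbers come from driving statements of roughly this shape through pgbench (a sketch only; the real workloads are in the Github scripts above):

-- upsert.sql -- run with something like:
--   pgbench -n -f upsert.sql -c 8 -j 8 -T 60
-- With a serial primary key, every statement takes the insert
-- path, so this compares pure UPSERT overhead against an
-- "equivalent" plain INSERT.
INSERT INTO foo(b, c) VALUES(1, 'txt') ON CONFLICT UPDATE SET b = 2;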
But this is a particularly unsympathetic case, because I've deliberately exaggerated the effects of heavyweight lock contention on leaf pages by using a serial primary key. Plus, there's the additional planning and parsing overhead. When comparing updating with upserting, it's a similar story. 100,000 tuples are pre-inserted in each case. I can sustain ~98,000 TPS with plain updates, or ~70,000 TPS with "equivalent" upserts. B-Tree index page heavyweight lock contention probably explains some of the difference between "UPSERT inserts" and "UPSERT updates".

Interlocking with VACUUM, race conditions
=========================================

In previous revisions, when we went to lock + update a tuple, no "value locks" were held, and neither were any B-Tree page buffer pins, because they were both released at the same time (recall that I call my heavyweight lock on B-Tree leaf pages a value lock). We still do that (unprincipled deadlocks are our only alternative), but now hold on to the pin for longer, until after tuple locking. Old versions of this patch used to sit on the B-Tree buffer pin to prevent concurrent deletion only as long as value locks were held, but maybe it isn't good enough to sit on the pin only until before we lock/update, as value locks are released: dropping the pin implies that the heap tuple can physically go away, and in general the same TID may then contain anything.

We may have to interlock against VACUUM by sitting on the B-Tree buffer pin (but not the value lock) throughout locking + update. That makes it impossible for the heap tuple slot to fail to relate to the tuple from the B-Tree that is under consideration for locking/updating. Recall that we aren't quite dealing with MVCC semantics here, since in READ COMMITTED mode we can lock a conclusively committed + visible tuple with *no* version visible to our command's MVCC snapshot.
Therefore, it seems worth considering the possibility that the nbtree README's observations on the necessity of holding a pin to interlock against VACUUM (for non-MVCC snapshots) apply. In this revision we have two callbacks (or two calls to the same callback, with different effects): one to release value locks early, to avoid unprincipled deadlocks, and a second to finally release the last unneeded buffer pin.

Recall that when we find a conflict (within _bt_checkunique()), it must be conclusively committed and visible to new MVCC snapshots; we know at that juncture that it's live. The concern is that it might be deleted *and* garbage collected in the interim between finding the conflict tuple and locking it (in practice this interim period is only an instant). This is probably too paranoid, though: the fact that the upserter's transaction is running ought to imply that GetOldestXmin() returns an XID sufficient to prevent this. OTOH, I'm not sure that there exists anything that looks like a precedent for relying on blocking VACUUM in this manner, and it might turn out to be limiting to rely on this.

And, I hasten to add, my fix (sitting on a B-Tree pin throughout row locking) is in another way perhaps not paranoid enough: who is to say that our conflicting value is on the same B-Tree leaf page as our value lock? It might not be, since _bt_checkunique() looks at later B-Tree pages (the value locked page is merely "the first leaf page the value could be on"). Pinning the heavyweight lock page's buffer is certainly justified by the need for non-speculative inserters to see a flag that obligates them to acquire the heavyweight page lock themselves (see comments in patch for more), but this other reason is kind of dubious. In other words: I'm relying on the way VACUUM actually works to prevent premature garbage collection.
It's possible to imagine a world in which HeapTupleSatisfiesVacuum() is smart enough to realize that the tuple UPSERT wants to lock is not visible to anyone (assuming MVCC semantics, etc), and never can be. I've tentatively added code to keep a buffer pin for longer, but that's probably not good enough if we assume that it's necessary at all. Basically, I want to be comfortable about my rationale for it being okay that a "non-MVCC" "index scan" doesn't hold a pin, but right now I'm not. I was conflicted on whether or not I should include the "unpin later" logic at all; for now I've left it in, if only as a placeholder.

Needless to say, if there is a race condition you can take it that it's very difficult to isolate. FWIW, somewhat extensive stress-testing has revealed no bugs that you might associate with these problems, with and without extended buffer pinning, and with artificial random sleeps added at key points in an effort to make any race condition bugs manifest themselves. I have made a concerted effort to break the patch in that way, and I'm now running out of ideas. Running the stress tests (with random delays at key points in the code) for several days reveals no bugs. This is on the same dedicated 8 core server, with plenty of concurrency. It's probably a good idea to begin using my B-Tree verification tool [7] for testing...on the other hand, it doesn't know anything about MVCC, and will only detect the violation of invariants that are localized to the B-Tree code, at least at the moment.

Open items
==========

I already mentioned the inability to reference rejected rows in an UPDATE, as well as my unease about VACUUM interlocking, both of which are open items. Also, some of the restrictions that I already mentioned - on updatable views, inheritance, and foreign tables - are probably unnecessary. We should be able to come up with reasonable behavior for at least some of those.
Patch
=====

I thought that I went too long without posting something about all of this to the list to get feedback, and so I decided to post this WIP patch set. I've tried to break it up into pieces, but it isn't all that suitable for representing as cumulative commits. I've also tried to break up the discussion usefully (the question of how everything fits together at a high level can hopefully be discussed separately from the question of how "value locks" are actually implemented).

Thoughts?

[1] http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf ("Goals for UPSERT in Postgres")
[2] http://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
[3] https://sqlite.org/lang_conflict.html
[4] http://www.postgresql.org/message-id/CAM3SWZQoArVQGMi=v-jk3sBjsPg+wdjeUkM_6L5TZG_i9pyGzQ@mail.gmail.com
[5] http://www.postgresql.org/message-id/52B4AAF0.5090806@vmware.com
[6] http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com
[7] http://www.postgresql.org/message-id/CAM3SWZRtV+xmRWLWq6c-x7czvwavFdwFi4St1zz4dDgFH4yN4g@mail.gmail.com

--
Peter Geoghegan
On 08/28/2014 04:43 AM, Peter Geoghegan wrote:
> -- Nesting within wCTE:
> WITH t AS (
>     INSERT INTO z SELECT i, 'insert'
>     FROM generate_series(0, 16) i
>     ON CONFLICT UPDATE SET v = v || 'update' -- use of
> operators/functions in targetlist
>     RETURNING * -- only projects inserted tuples, never updated tuples
> )
> SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;

Personally I would find it surprising if RETURNING did not also return the updated tuples. In many use cases for upsert the user does not care if the row was new or not. What I think would be useful is if all tuples were returned but there was some way to filter out only the inserted ones.

--
Andreas Karlsson
On Thu, Aug 28, 2014 at 7:29 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> Personally I would find it surprising if RETURNING did not also return the
> updated tuples. In many use cases for upsert the user does not care if the
> row was new or not.

I'm not attached to that particular behavior, but it does seem kind of similar to the behavior of BEFORE triggers, where a NULL return value ("do nothing") will also cause RETURNING to not project the tuple.

--
Peter Geoghegan
On 08/28/2014 09:05 PM, Peter Geoghegan wrote:
> On Thu, Aug 28, 2014 at 7:29 AM, Andreas Karlsson <andreas@proxel.se> wrote:
>> Personally I would find it surprising if RETURNING did not also return the
>> updated tuples. In many use cases for upsert the user does not care if the
>> row was new or not.
>
> I'm not attached to that particular behavior, but it does seem kind of
> similar to the behavior of BEFORE triggers, where a NULL return value
> ("do nothing") will also cause RETURNING to not project the tuple.

I see. So we have three cases where we may or may not want to project a tuple:

1) The tuple was inserted
2) We got a conflict and updated the tuple
3) We got a conflict but skipped updating the tuple

My personal intuition was that (1) and (2) would be returned but not (3). But I am not sure if that is the most useful behavior.

--
Andreas Karlsson
On Wed, Aug 27, 2014 at 7:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> There are some restrictions on what this auxiliary update may do, but
> FWIW there are considerably fewer than those that the equivalent MySQL
> or SQLite feature imposes on their users.

I realized that I missed a few cases here. For one thing, the posted patch fails to arrange for the UPDATE post-parse-analysis tree representation to go through the rewriter stage (on the theory that user-defined rules shouldn't be able to separately affect the auxiliary UPDATE query tree), but rewriting is at least necessary so that rewriteTargetListIU() can expand a "SET val = DEFAULT" targetlist, as well as normalize the ordering of the UPDATE's tlist. Separately, the patch fails to defend against certain queries that ought to be disallowed, where a subselect is specified with a subquery expression in the auxiliary UPDATE's WHERE clause.

These are garden-variety bugs that aren't likely to affect the kind of high-level design discussion that I'm looking for here. I'll post a fixed version in a few days' time.

--
Peter Geoghegan
On Thu, Aug 28, 2014 at 8:05 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I realized that I missed a few cases here. For one thing, the posted
> patch fails to arrange for the UPDATE post-parse-analysis tree
> representation to go through the rewriter stage (on the theory that
> user-defined rules shouldn't be able to separately affect the
> auxiliary UPDATE query tree), but rewriting is at least necessary so
> that rewriteTargetListIU() can expand a "SET val = DEFAULT"
> targetlist, as well as normalize the ordering of the UPDATE's tlist.
> Separately, the patch fails to defend against certain queries that
> ought to be disallowed, where a subselect is specified with a subquery
> expression in the auxiliary UPDATE's WHERE clause.

Attached revision fixes all of these issues. I've added regression tests for each bug, too, although all changes are rebased into my original commits.

I decided to explicitly rely on a simpler approach to VACUUM interlocking. I no longer bother holding on to a buffer pin for a period longer than the period that associated "value locks" are held, which was something I talked about at the start of this thread. There is a note on this added to the nbtree README, just after the master branch's current remarks on B-Tree VACUUM interlocking.

I've also pushed the responsibility for supporting this new feature on foreign tables onto FDWs themselves. The only writable FDW we currently ship, postgres_fdw, lacks support for the new feature, but this can be revisited in due course. My impression is that the task of adding support is not quite a straightforward matter of adding a bit more deparsing logic, but also isn't significantly more difficult than that.

--
Peter Geoghegan
On Wed, Aug 27, 2014 at 10:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Example usage:
>
> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT UPDATE
> SET val = 'update';

I think that syntax is a dramatic improvement over your previous proposals. The only part I don't entirely like is this:

> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
> upsert_pkey UPDATE SET val = 'update';

It seems to me that it would be better to specify a conflicting column set rather than a conflicting index name.

I don't have much in the way of comments about the implementation, at least not right at the moment, but...

> Essentially, the implementation has all stages of query processing
> During the execution of the parent ModifyTable, a special auxiliary
> subquery (the UPDATE ModifyTable) is considered as a special case.
> This is not a subplan of the ModifyTable node in the conventional
> sense, and so does not appear within EXPLAIN output.

...that sounds wonky.

> I already mentioned the inability to reference rejected rows in an
> UPDATE, as well as my unease about VACUUM interlocking, both of which
> are open item. Also, some of the restrictions that I already mentioned
> - on updatable views, inheritance, and foreign tables - are probably
> unnecessary. We should be able to come with reasonable behavior for at
> least some of those.

If you've noted my comments on the UPDATE/DELETE .. ORDER BY .. LIMIT thread, you won't be surprised to hear that I think those restrictions will need to be lifted - especially for inheritance, but probably the others as well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Essentially, the implementation has all stages of query processing
>> During the execution of the parent ModifyTable, a special auxiliary
>> subquery (the UPDATE ModifyTable) is considered as a special case.
>> This is not a subplan of the ModifyTable node in the conventional
>> sense, and so does not appear within EXPLAIN output.
>
> ...that sounds wonky.

Which part? It certainly wouldn't be helpful if the (say) auxiliary plan's "sequential scan" appeared within EXPLAIN output. That's just an implementation detail. Note that the structure of the plan is highly restricted, since it needs to be "driven by the insert" (or, rather, the insert's conflicts, including conflicts not visible to the command's MVCC snapshot). There won't be any interesting variation in the plan. Although, that said, the implementation should probably display any "Filter: ..." conditions implied by the special UPDATE qual.

> If you've noted my comments on the UPDATE/DELETE .. ORDER BY .. LIMIT
> thread, you won't be surprised to hear that I think those restrictions
> will need to be lifted - especially for inheritance, but probably the
> others as well.

Sure.

--
Peter Geoghegan
On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
>> upsert_pkey UPDATE SET val = 'update';
>
> It seems to me that it would be better to specify a conflicting column
> set rather than a conflicting index name.

I'm open to pursuing that, provided there is a possible implementation that's robust against things like BEFORE triggers that modify constrained attributes. It must also work well with partial unique indexes. So I imagine we'd have to determine a way of looking up the unique index only after BEFORE triggers fire. Unless you're comfortable with punting on some of these cases by throwing an error, all of this is actually surprisingly ticklish. You've already expressed concerns about the feature not playing nice with existing, peripheral features, though.

--
Peter Geoghegan
On Wed, Sep 3, 2014 at 2:13 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Essentially, the implementation has all stages of query processing
>>> During the execution of the parent ModifyTable, a special auxiliary
>>> subquery (the UPDATE ModifyTable) is considered as a special case.
>>> This is not a subplan of the ModifyTable node in the conventional
>>> sense, and so does not appear within EXPLAIN output.
>>
>> ...that sounds wonky.
>
> Which part? It certainly wouldn't be helpful if the (say) auxiliary
> plan's "sequential scan" appeared within EXPLAIN output. That's just
> an implementation detail. Note that the structure of the plan is
> highly restricted, since it needs to be "driven by the insert" (or,
> rather, the insert's conflicts, including conflicts not visible to the
> command's MVCC snapshot). There won't be any interesting variation in
> the plan. Although, that said, the implementation should probably
> display any "Filter: ..." conditions implied by the special UPDATE
> qual.

I think there shouldn't be any plan nodes in the system that don't get displayed by explain. If you're using a plan node for something, and think it shouldn't be displayed by explain, then either (1) you are wrong or (2) you are abusing the plan node.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Sep 4, 2014 at 8:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think there shouldn't be any plan nodes in the system that don't get
> displayed by explain. If you're using a plan node for something, and
> think it shouldn't be displayed by explain, then either (1) you are
> wrong or (2) you are abusing the plan node.

Maybe. I admit that I'm not entirely confident that the representation of the auxiliary state during planning and execution is ideal. However, it sure is convenient to be able to separately plan the auxiliary query as a subquery, and not have to specially fish it out of the subplan list later. Maybe we should add a mechanism that essentially generates an equivalent, single ModifyTable plan. Or maybe that would be adding a lot of code for no tangible benefit. I don't see much point in making one ModifyTable node pull up from the other for the benefit of this feature (which is another thing entirely from having there be a single ModifyTable plan).

For now, I'm glad to have something that will allow us to drive discussion of the feature to the next level. I don't have a good enough understanding of the optimizer to be able to say with confidence what we should do, or to be able to see the big picture of making any particular trade-off. It's not an immediate concern, though.

--
Peter Geoghegan
On Thu, Sep 4, 2014 at 11:55 AM, Peter Geoghegan <pg@heroku.com> wrote:
> It's not an immediate concern, though.

My immediate concern is to get some level of buy-in about how everything fits together at a high level. Separately, as discussed in my opening mail, there is the question of how value locking should ultimately be implemented. These are two orthogonal questions, or pretty close to orthogonal. That helps. It also helps that people have stopped being confused by the term "value locking" (I think).

I'm tempted to read the silence on the question of how things fit together (such as the lack of discussion of my pgCon talk's characterization of a "pick any 2" trade-off) as meaning that everyone agrees. That seems pretty naive, though, because a lot of the issues are very subtle. I think that various interested people, including Robert and Andres, have yet to make their minds up on that. I'm not sure what Tom thinks of it.
-- 
Peter Geoghegan
On Wed, Aug 27, 2014 at 7:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Omission
> =======
>
> The patch currently lacks a way of referencing datums rejected for insertion when updating.

Attached revision of the patch set (which I'll call v1.2) adds this capability in a separate commit. It now becomes possible to add a CONFLICTING expression within the ON CONFLICT UPDATE targetlist or predicate. Example use:

"""
postgres=# CREATE TABLE upsert(key int4 PRIMARY KEY, val text);
CREATE TABLE
postgres=# INSERT INTO upsert VALUES(1, 'Giraffe');
INSERT 0 1
postgres=# SELECT * FROM upsert;
 key |   val
-----+---------
   1 | Giraffe
(1 row)

postgres=# INSERT INTO upsert VALUES(1, 'Bear'), (2, 'Lion') ON CONFLICT UPDATE SET val = CONFLICTING(val);
INSERT 0 1
postgres=# SELECT * FROM upsert;
 key | val
-----+------
   1 | Bear
   2 | Lion
(2 rows)
"""

Note that the effects of BEFORE INSERT triggers are carried over here, which I slightly favor over the alternative of not having it work that way.

I've also expanded upon my explanation for the structure of the query tree and plan within (revised/rebased versions of) earlier commits. I am clearer on why there is a special subquery planning step for the auxiliary UPDATE, rather than making the UPDATE directly accessible as a subquery within the post-parse-analysis query tree. Basically, the optimizer has no basis for understanding that a DML sublink isn't optimizable. It'll try to pull up the subquery and so on, which of course does not and cannot work. Whereas treating it as an independently planned subquery of the top-level query, kind of like a data-modifying CTE, makes sense. (With such CTEs, the executor is prepared for the possibility that not all rows will be pulled up, so there too the executor drives execution more directly than makes sense when not dealing with DML: it finishes off the data-modifying CTE's DML for any still-unconsumed tuples, within ExecPostprocessPlan().)
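For comparison, the CONFLICTING() expression plays much the same role as the "excluded" pseudo-table in SQLite's upsert syntax (SQLite 3.24 and later); the same Bear/Lion example can be sketched through Python's sqlite3 module. Note that, unlike the patch, SQLite makes the conflict target mandatory, and none of the BEFORE trigger subtleties discussed here apply:

```python
import sqlite3

# The Bear/Lion example above, via SQLite's upsert syntax (3.24+),
# whose "excluded" pseudo-table plays the role of CONFLICTING().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upsert(key INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO upsert VALUES (1, 'Giraffe')")

# Row 1 conflicts and is updated to the rejected value; row 2 is inserted.
conn.execute(
    "INSERT INTO upsert VALUES (1, 'Bear'), (2, 'Lion') "
    "ON CONFLICT(key) DO UPDATE SET val = excluded.val"
)

print(conn.execute("SELECT key, val FROM upsert ORDER BY key").fetchall())
# [(1, 'Bear'), (2, 'Lion')]
```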
It's certainly possible that a more unified representation makes sense (i.e. one ModifyTable plan, likely still having separate INSERT/UPDATE representations at earlier stages of query processing), but that would require serious refactoring of the representation of ModifyTable operations -- just for example, consider the need for a unified-though-separate targetlist, one for the INSERT part, the other for the UPDATE part. For now, I continue to find it very convenient to represent the UPDATE as a selectively executed, auxiliary, distinct ModifyTable plan, rather than adding a subquery rangetable directly during parse analysis.

There is another significant change. In this revision, I am at least "honest" about the plan representation within EXPLAIN:

"""
postgres=# EXPLAIN ANALYZE INSERT INTO upsert VALUES(1, 'Bear'), (2, 'Lion') ON CONFLICT UPDATE SET val = CONFLICTING(val);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.03 rows=2 width=36) (actual time=0.115..0.115 rows=0 loops=1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=36) (actual time=0.003..0.005 rows=2 loops=1)
   ->  Conflict Update on upsert  (cost=0.00..22.30 rows=1230 width=36) (actual time=0.042..0.051 rows=0 loops=1)
         ->  Seq Scan on upsert  (cost=0.00..22.30 rows=1230 width=36) (never executed)
 Planning time: 0.065 ms
 Execution time: 0.158 ms
(6 rows)

postgres=# EXPLAIN ANALYZE INSERT INTO upsert VALUES(1, 'Bear'), (2, 'Lion') ON CONFLICT UPDATE SET val = CONFLICTING(val) where key = 2;
                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.03 rows=2 width=36) (actual time=0.075..0.075 rows=0 loops=1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=36) (actual time=0.001..0.002 rows=2 loops=1)
   ->  Conflict Update on upsert  (cost=4.16..8.17 rows=1 width=36) (actual time=0.012..0.026 rows=0 loops=1)
         ->  Bitmap Heap Scan on upsert  (cost=4.16..8.17 rows=1 width=36) (never executed)
               Recheck Cond: (key = 2)
               ->  Bitmap Index Scan on upsert_pkey  (cost=0.00..4.16 rows=1 width=0) (never executed)
                     Index Cond: (key = 2)
 Planning time: 0.090 ms
 Execution time: 0.125 ms
(9 rows)
"""

The second query gets a bitmap scan because plain index scans have been disabled for the UPDATE (a temporary kludge), since index-only scans can break things - IndexOnlyRecheck() throws an error. I'm not quite sure why the optimizer doesn't care about resjunk for the UPDATE here, which is presumably why in general regular updates never use index-only scans. Since I think the actual auxiliary plan generation needs work, so as to not have uselessly complicated plans, I didn't try too hard to figure that out.

This plan structure is not acceptable, of course, but maybe almost the same thing would be acceptable if the auxiliary plan shown here wasn't unnecessarily complex - if we forced a simple pseudo-scan placeholder, without wasting optimizer cycles, somewhat in the style of WHERE CURRENT OF. This is something discussed in newly expanded comments within planner.c. I would have made the optimizer produce a suitably simple plan myself, but I don't have a good enough understanding of it to figure out how (at least in a reasonable amount of time). Pointers on how this might be accomplished are very welcome.

With this addition, the feature is functionally complete. That just leaves the small matter of how it has been implemented. :-)

This is still clearly a work in progress implementation, with design trade-offs that are very much in need of fairly high level discussion.
-- 
Peter Geoghegan
Attachment
- 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch.gz
- 0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch.gz
- 0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz
- 0003-CONFLICTING-expressions-within-ON-CONFLICT-UPDATE.patch.gz
- 0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz
On 28 August 2014 03:43, Peter Geoghegan <pg@heroku.com> wrote:
> The patch currently lacks a way of referencing datums rejected for insertion when updating. The way MySQL handles the issue seems questionable. They allow you to do something like this:
>
> INSERT INTO upsert (key, val) VALUES (1, 'val') ON DUPLICATE KEY UPDATE val = VALUES(val);
>
> The implication is that the updated value comes from the INSERT's VALUES() list, but emulating that seems like a bad idea. In general, at least with Postgres it's entirely possible that values rejected differ from the values appearing in the VALUES() list, due to the effects of before triggers. I'm not sure whether or not we should assume equivalent transformations during any UPDATE before triggers.
>
> This is an open item. I think it makes sense to deal with it a bit later.

IMHO it is impossible to know if any of the other code is correct until we have a clear and stable vision of what the command is supposed to perform. The inner workings are less important than what the feature does.

FWIW, the row available at the end of all BEFORE triggers is clearly the object we should be manipulating, not the original VALUES() clause. Otherwise this type of INSERT would behave differently from normal INSERTs, which would likely violate RLS, if nothing else.

> As I mentioned, I have incorporated feedback from Kevin Grittner. You may specify a unique index to merge on from within the INSERT statement, thus avoiding the risk of inadvertently having the update affect the wrong tuple due to the user failing to consider that there was a would-be unique violation within some other unique index constraining some other attribute. You may write the DML statement like this:
>
> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN upsert_pkey UPDATE SET val = 'update';
>
> I think that there is a good chance that at least some people will want to make this mandatory. I guess that's fair enough, but I *really* don't want to *mandate* that users specify the name of their unique index in DML for obvious reasons. Perhaps we can come up with a more tasteful syntax that covers all interesting cases (consider the issues with partial unique indexes and before triggers for example, where a conclusion reached about which index to use during parse analysis may subsequently be invalidated by user-defined code, or ambiguous specifications in the face of overlapping attributes between two unique composite indexes, etc). The Right Thing is far from obvious, and there is very little to garner from other systems, since SQL MERGE promises essentially nothing about concurrency, both as specified by the standard and in practice. You don't need a unique index at all, and as I showed in my pgCon talk, there are race conditions even for trivial UPSERT operations in all major SQL MERGE implementations.

Surely if there are multiple unique indexes then the result row must be validated against all unique indexes before it is allowed at all?

The only problem I see is if the newly inserted row matches one row on one unique value and a different row on a different unique index. Turning the INSERT into an UPDATE will still fail on one or other, no matter which index you pick. If there is one row for ALL unique indexes then it is irrelevant which index you pick. So either way, I cannot see a reason to specify an index.

If we do need such a construct, we already have the concept of an IDENTITY for a table, added in 9.4, currently targeted at replication. Listing indexes or columns in the DML statement is more pushups for developers and ORMs, so let's KISS.

The way forward, in my view, is to define precisely the behaviour we wish to have. That definition will include the best current mechanism for running an UPSERT using INSERT/UPDATE/loops and comparing that against what is being provided here.
We will then have a functional test of equivalence of the approaches, and a basis for making a performance test that shows that performance is increased without any loss of concurrency. Once we have that, we can then be certain our time spent on internals is not wasted by overlooking a simple userland gotcha. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 25, 2014 at 10:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> IMHO it is impossible to know if any of the other code is correct until we have a clear and stable vision of what the command is supposed to perform.

+1.

> The inner workings are less important than what the feature does.

+1.

> FWIW, the row available at the end of all BEFORE triggers is clearly the object we should be manipulating, not the original VALUES() clause. Otherwise this type of INSERT would behave differently from normal INSERTs. Which would likely violate RLS, if nothing else.

+1.

> Surely if there are multiple unique indexes then the result row must be validated against all unique indexes before it is allowed at all?
>
> The only problem I see is if the newly inserted row matches one row on one unique value and a different row on a different unique index. Turning the INSERT into an UPDATE will still fail on one or other, no matter which index you pick. If there is one row for ALL unique indexes then it is irrelevant which index you pick. So either way, I cannot see a reason to specify an index.

Failure could be the right thing in some cases. For example, imagine that a user has a table containing names, email addresses, and (with apologies for the American-ism, but I don't know what would be comparable elsewhere) social security numbers. The user has unique indexes on both email addresses and SSNs. If a new record arrives for the same email address, they want to replace the existing record; but if a new record arrives with the same SSN, they want the transaction to fail. Otherwise, a newly-arrived record might overwrite the email address of an existing record, which they never want to do, because they view email address as the primary key. I think this kind of scenario will actually be pretty common.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 25 September 2014 15:35, Robert Haas <robertmhaas@gmail.com> wrote: >> The only problem I see is if the newly inserted row matches one row on >> one unique value and a different row on a different unique index. >> Turning the INSERT into an UPDATE will still fail on one or other, no >> matter which index you pick. If there is one row for ALL unique >> indexes then it is irrelevant which index you pick. So either way, I >> cannot see a reason to specify an index. > > Failure could be the right thing in some cases. For example, imagine > that a user has a table containing names, email addresses, and (with > apologies for the American-ism, but I don't know what would be > comparable elsewhere) social security numbers. The user has unique > indexes on both email addresses and SSNs. If a new record arrives for > the same email address, they want to replace the existing record; but > a new record arrives with the same SSN, they want the transaction to > fail. Otherwise, a newly-arrived record might overwrite the email > address of an existing record, which they never want to do, because > they view email address as the primary key. I agree with your example, but not your conclusion. If a new record arrives with a new email address that matches an existing record it will fail. There is a case that would be allowed, which would be a record that creates an entirely new email address. So you do have a point to argue from. However, IMV enforcing such a restriction should be done with an After trigger, which is already possible, not by complicating a DML statement with information it shouldn't need to know, or that might change in the future. Let's keep this new feature as simple as possible. ORMs everywhere need to be encouraged to implement this and they won't do it unless it is bone simple to use. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 25, 2014 at 11:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 25 September 2014 15:35, Robert Haas <robertmhaas@gmail.com> wrote: >>> The only problem I see is if the newly inserted row matches one row on >>> one unique value and a different row on a different unique index. >>> Turning the INSERT into an UPDATE will still fail on one or other, no >>> matter which index you pick. If there is one row for ALL unique >>> indexes then it is irrelevant which index you pick. So either way, I >>> cannot see a reason to specify an index. >> >> Failure could be the right thing in some cases. For example, imagine >> that a user has a table containing names, email addresses, and (with >> apologies for the American-ism, but I don't know what would be >> comparable elsewhere) social security numbers. The user has unique >> indexes on both email addresses and SSNs. If a new record arrives for >> the same email address, they want to replace the existing record; but >> a new record arrives with the same SSN, they want the transaction to >> fail. Otherwise, a newly-arrived record might overwrite the email >> address of an existing record, which they never want to do, because >> they view email address as the primary key. > > I agree with your example, but not your conclusion. > > If a new record arrives with a new email address that matches an > existing record it will fail. There is a case that would be allowed, > which would be a record that creates an entirely new email address. So > you do have a point to argue from. > > However, IMV enforcing such a restriction should be done with an After > trigger, which is already possible, not by complicating a DML > statement with information it shouldn't need to know, or that might > change in the future. I've never been a fan of putting the index name in there. I agree that's stuff that a DML statement shouldn't need to know about. 
What I've advocated for in the past is specifying the list of columns that should be used to determine whether to insert or update. If you have a match on those columns, update the row; else insert. Any other unique indexes stand or fall as may be. I still think that idea has merit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
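As a toy model (plain Python, with invented names; real arbitration would of course be against catalog unique indexes, not dicts), the column-list proposal combined with Robert's email/SSN scenario upthread behaves like this: a match on the named arbiter columns becomes an UPDATE, while every other unique constraint still stands or falls on its own.

```python
# Toy model of "specify the conflict columns, not the index".
# All names here are illustrative, not taken from the patch.

class UniqueViolation(Exception):
    pass

class Table:
    def __init__(self, unique_cols):
        self.unique_cols = unique_cols          # e.g. ["email", "ssn"]
        self.rows = []

    def upsert(self, row, conflict_cols):
        assert set(conflict_cols) <= set(self.unique_cols)
        for existing in self.rows:
            if all(existing[c] == row[c] for c in conflict_cols):
                # Conflict on the arbiter columns: turn into an UPDATE...
                candidate = {**existing, **row}
                # ...but every *other* unique constraint still applies.
                for col in self.unique_cols:
                    if col in conflict_cols:
                        continue
                    if any(o is not existing and o[col] == candidate[col]
                           for o in self.rows):
                        raise UniqueViolation(col)
                existing.update(row)
                return "updated"
        # No arbiter match: a plain INSERT, checked against all unique cols.
        for col in self.unique_cols:
            if any(o[col] == row[col] for o in self.rows):
                raise UniqueViolation(col)
        self.rows.append(dict(row))
        return "inserted"

t = Table(unique_cols=["email", "ssn"])
t.upsert({"email": "a@x.com", "ssn": "111", "name": "A"}, ["email"])
t.upsert({"email": "b@x.com", "ssn": "222", "name": "B"}, ["email"])

# Same email: replace the existing record.
print(t.upsert({"email": "a@x.com", "ssn": "111", "name": "A2"}, ["email"]))

# New email but an existing SSN: fails, rather than silently
# rewriting the email of whoever holds SSN 222.
try:
    t.upsert({"email": "c@x.com", "ssn": "222", "name": "C"}, ["email"])
except UniqueViolation as e:
    print("unique violation on", e)
```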
On 28 August 2014 03:43, Peter Geoghegan <pg@heroku.com> wrote:
> "Value locking"
> ===========
>
> To date, on-list discussion around UPSERT has almost exclusively concerned what I've called "value locking"; the idea of locking values in unique indexes in the abstract (to establish the right to insert ahead of time). There was some useful discussion on this question between myself and Heikki back around December/January. Ultimately, we were unable to reach agreement on an approach and discussion tapered off. However, Heikki did understand the concerns that informed my design. He recognized the need to be able to easily *release* value locks, so as to avoid "unprincipled deadlocks", where under high concurrency there are deadlocks between sessions that only UPSERT a single row at a time. I'm not sure how widely appreciated this point is, but I believe that Heikki appreciates it. It is a very important point in my opinion. I don't want an implementation that is in any way inferior to the "UPSERT looping subxact" pattern (i.e. the plpgsql thing that the docs suggest).
>
> When we left off, Heikki continued to favor an approach that involved speculatively inserting heap tuples, and then deleting them in the event of a conflict. This design was made more complicated when the need to *release* value locks became apparent (Heikki ended up making some changes to HeapTupleSatisfiesDirty(), as well as sketching a design for what you might call a "super delete", where xmin can be set to InvalidTransactionId for speculatively-inserted heap tuples). After all, it wasn't as if we could abort a subxact to release locks, which is what the "UPSERT looping subxact" pattern does. I think it's fair to say that that design became more complicated than initially anticipated [4] [5].
>
> Anyway, the greater point here is that fundamentally, AFAICT Heikki and I were in agreement.
Once you buy into the idea that we must avoid > holding on to "value locks" of whatever form - as Heikki evidently did > - then exactly what form they take is ultimately only a detail. > Granted, it's a very important detail, but a detail nonetheless. It > can be discussed entirely independently of all of this new stuff, and > thank goodness for that. > > If anyone finds my (virtually unchanged) page heavyweight lock based > value locking approach objectionable, I ask that the criticism be > framed in a way that makes a sharp distinction between each of the > following: > > 1. You don't accept that value locks must be easily released in the > event of a conflict. Is anyone in this camp? It's far from obvious to > me what side of this question Andres is on at this stage, for example. > Robert might have something to say here too. > > 2. Having taken into account the experience of myself and Heikki, and > all that is implied by taking that approach ***while avoiding > unprincipled deadlocks***, you continue to believe that an approach > based on speculative heap insertion, or some alternative scheme is > better than what I have done to the nbtree code here, or you otherwise > dislike something about the proposed value locking scheme. You accept > that value locks must be released and released easily in the event of > a conflict, but like Heikki you just don't like what I've done to get > there. > > Since we can (I believe) talk about the value locking aspect and the > rest of the patch independently, we should do so...unless you're in > camp 1, in which case I guess that we'll have to thrash it out. I'm trying to understand and help out with pushing this patch forwards to completion. Basically, I have absolutely no idea whether I object to or agree with 1) and don't know where to look to find out. We need a clear exposition of design and the alternatives. 
My approach would be to insert an index tuple for that value into the index, but with the leaf ituple marked with an xid rather than a ctid. If someone tries to insert into the index they would see this and wait for the inserting transaction to end. The inserting transaction would then resolve what happens in the heap (insert/update) and later repoint the index tuple to the inserted/updated row version. I don't see the need for page level locking since it would definitely result in deadlocks (e.g. SQLServer). -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 25, 2014 at 11:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Basically, I have absolutely no idea whether I object to or agree with > 1) and don't know where to look to find out. We need a clear > exposition of design and the alternatives. > > My approach would be to insert an index tuple for that value into the > index, but with the leaf ituple marked with an xid rather than a ctid. > If someone tries to insert into the index they would see this and wait > for the inserting transaction to end. The inserting transaction would > then resolve what happens in the heap (insert/update) and later > repoint the index tuple to the inserted/updated row version. I don't > see the need for page level locking since it would definitely result > in deadlocks (e.g. SQLServer). The page level locks are only used to prevent concurrent insertion for as long as it takes to get consensus to proceed among unique indexes, and to actually insert a heap tuple. They're all released before we lock the tuple for update, should we take that path (yes, really). This is consistent with the behavior of other systems, I think. That's my whole reason for preferring to do things that way. If you have a "promise tuples" approach - be it what you outline here, or what Heikki prototyped with heap tuple insertion, or any other - then you need a way to *release* those "value locks" in the event of a conflict/needing to update, before locking/updating. Otherwise, you get deadlocks. This is an issue I highlighted when it came up with Heikki's prototype. AFAICT, any scheme for "value locking" needs to strongly consider the need to *release* value locks inexpensively. Whatever else they do, they cannot persist for the duration of the transaction IMV. Does that make sense? If not, my next suggestion is applying an earlier revision of Heikki's prototype, and seeing for yourself how it can be made to deadlock in an unprincipled/impossible to prevent way [1]. 
You've quite rightly highlighted the existing subxact looping pattern as something that this needs to be better than in every way. This is one important way in which we might fail to live up to that standard. [1] http://www.postgresql.org/message-id/52B4AAF0.5090806@vmware.com -- Peter Geoghegan
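A deliberately tiny, single-index sketch of the ordering Peter describes: take the value lock, detect any conflict, and on conflict release the value lock *before* going to lock the existing row. The invariant is that nothing is held by the time we wait on the conflicting tuple, so no one can be waiting on our value lock while we wait on their row. All names are invented; the real value locks are page heavyweight locks in nbtree, not dict entries.

```python
# Phase 1: value lock. On conflict, release it before tuple locking,
# so a session blocked on our row lock is never also blocked on a
# value lock we still hold. (Illustrative simulation only.)

value_locks = {}                  # (index, value) -> locking session
heap = {"a@x.com": "row-1"}       # existing unique value -> tuple id

def upsert_attempt(session, value):
    key = ("email_idx", value)
    assert key not in value_locks, "would block on a concurrent inserter"
    value_locks[key] = session                 # phase 1: value lock taken
    if value in heap:
        del value_locks[key]                   # conflict: release first,
        return ("lock-tuple-then-update", heap[value])   # then lock the row
    heap[value] = "row-" + session             # no conflict: insert proceeds
    del value_locks[key]
    return ("inserted", heap[value])

outcome, target = upsert_attempt("s1", "a@x.com")
print(outcome, target)     # lock-tuple-then-update row-1
print(value_locks)         # {} -- nothing held while we wait on row-1's lock
outcome, target = upsert_attempt("s2", "b@x.com")
print(outcome, target)     # inserted row-s2
```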
On Thu, Sep 25, 2014 at 7:35 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Sep 25, 2014 at 10:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> IMHO it is impossible to know if any of the other code is correct >> until we have a clear and stable vision of what the command is >> supposed to perform. > > +1. > >> The inner workings are less important than what the feature does. > > +1. > >> FWIW, the row available at the end of all BEFORE triggers is clearly >> the object we should be manipulating, not the original VALUES() >> clause. Otherwise this type of INSERT would behave differently from >> normal INSERTs. Which would likely violate RLS, if nothing else. > > +1. I agree with all of this. I'm glad that my opinion on how a CONFLICTING() expression interacts with BEFORE triggers is accepted, too. -- Peter Geoghegan
On Thu, Sep 25, 2014 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I've never been a fan of putting the index name in there. Me neither. Although I do understand Kevin's concern about the user's intent surrounding which unique index to merge on. > I agree > that's stuff that a DML statement shouldn't need to know about. What > I've advocated for in the past is specifying the list of columns that > should be used to determine whether to insert or update. If you have > a match on those columns, update the row; else insert. Any other > unique indexes stand or fall as may be. > > I still think that idea has merit. As I've said, my problem with that idea is the corner cases. Consider the possible ambiguity. Could DML queries in production start failing ("ambiguous unique index specification") because the DBA created a new unique index on attributes that somewhat overlap the attributes of the unique index that the DML author actually meant? What about the effects of BEFORE triggers, and their interaction with partial unique indexes? If you can describe an exact behavior that overcomes these issues, then I'll give serious consideration to implementing it. As things stand, to be perfectly honest it sounds like a footgun to me. There are interactions that make getting it right very ticklish. I don't want to make a unique index specification mandatory because that's ugly - that's the only reason, TBH. However, while what you describe here accomplishes the same thing without being ugly, it is potentially very surprising. Naming the unique index directly has the great advantage of very clearly demonstrating user intent. -- Peter Geoghegan
On Thu, Sep 25, 2014 at 2:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> 1. You don't accept that value locks must be easily released in the >> event of a conflict. Is anyone in this camp? It's far from obvious to >> me what side of this question Andres is on at this stage, for example. >> Robert might have something to say here too. >> >> 2. Having taken into account the experience of myself and Heikki, and >> all that is implied by taking that approach ***while avoiding >> unprincipled deadlocks***, you continue to believe that an approach >> based on speculative heap insertion, or some alternative scheme is >> better than what I have done to the nbtree code here, or you otherwise >> dislike something about the proposed value locking scheme. You accept >> that value locks must be released and released easily in the event of >> a conflict, but like Heikki you just don't like what I've done to get >> there. >> >> Since we can (I believe) talk about the value locking aspect and the >> rest of the patch independently, we should do so...unless you're in >> camp 1, in which case I guess that we'll have to thrash it out. > > I'm trying to understand and help out with pushing this patch forwards > to completion. > > Basically, I have absolutely no idea whether I object to or agree with > 1) and don't know where to look to find out. We need a clear > exposition of design and the alternatives. I laughed when I read this, because I think a lot of the discussion on this topic has been unnecessarily muddled by jargon. > My approach would be to insert an index tuple for that value into the > index, but with the leaf ituple marked with an xid rather than a ctid. > If someone tries to insert into the index they would see this and wait > for the inserting transaction to end. The inserting transaction would > then resolve what happens in the heap (insert/update) and later > repoint the index tuple to the inserted/updated row version. 
I don't > see the need for page level locking since it would definitely result > in deadlocks (e.g. SQLServer). I think that something like this might work, but the devil is in the details. Suppose two people try to upsert into the same table at the same time. There's one index. If the transactions search that index for conflicts first, neither sees any conflicting tuples, and both proceed. That's no good. OK, so suppose each transaction inserts the special index tuple which you mention, to lock out concurrent inserts of that value, and then searches for already-existing conflicts. Each sees the other's tuple, and they deadlock. That's no good, either. Also, I think there are other cases where we think we're going to insert, so we put the special index tuple in there, but then we decide to update, so we don't need the promise tuple any more, but other sessions are potentially still waiting for our XID to terminate even though there's no conflict any more. I'm having a hard time bringing the details of those cases to mind ATM, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
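Robert's two-session scenario is a textbook wait-for cycle: each session inserts its promise tuple first, then searches for pre-existing conflicts, finds the other session's promise, and blocks on its XID. A sketch of how that shows up if you model it (pure illustration; names invented):

```python
def find_cycle(waits_for):
    """Return the sessions visited on a wait chain that revisits
    itself (a deadlock), or None if every chain terminates."""
    for start in waits_for:
        seen, node = set(), start
        while node in waits_for:
            if node in seen:
                return sorted(seen)
            seen.add(node)
            node = waits_for[node]
    return None

# Two sessions both upsert key=1: A waits on B's promise, B on A's.
waits_for = {"xact A": "xact B", "xact B": "xact A"}
print(find_cycle(waits_for))          # ['xact A', 'xact B']
```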
On Thu, Sep 25, 2014 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I think that something like this might work, but the devil is in the > details. Suppose two people try to upsert into the same table at the > same time. There's one index. If the transactions search that index > for conflicts first, neither sees any conflicting tuples, and both > proceed. That's no good. OK, so suppose each transaction inserts the > special index tuple which you mention, to lock out concurrent inserts > of that value, and then searches for already-existing conflicts. Each > sees the other's tuple, and they deadlock. That's no good, either. I'm very glad that you share my concern about deadlocks like this. > Also, I think there are other cases where we think we're going to > insert, so we put the special index tuple in there, but then we decide > to update, so we don't need the promise tuple any more, but other > sessions are potentially still waiting for our XID to terminate even > though there's no conflict any more. I'm having a hard time bringing > the details of those cases to mind ATM, though. Well, you might have a promise tuple in a unique index on attributes not appearing in the UPDATE's targetlist, for one. You have the other session waiting (doesn't have to be an upserter) just because we *thought about* inserting a value as part of an upsert. That's pretty bad. -- Peter Geoghegan
On Thu, Sep 25, 2014 at 7:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > The way forwards, in my view, is to define precisely the behaviour we > wish to have. That definition will include the best current mechanism > for running an UPSERT using INSERT/UPDATE/loops and comparing that > against what is being provided here. We will then have a functional > test of equivalence of the approaches, and a basis for making a > performance test that shows that performance is increased without any > loss of concurrency. That sounds very reasonable. While I'm sure that what I have here can decisively beat the xact looping pattern in terms of performance as measured by pgbench, the real performance advantage is that this approach doesn't burn through XIDs. That was a concern that Andres highlighted in relation to using the subxact looping pattern with BDR's multi-master replication conflict resolution. -- Peter Geoghegan
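The XID consumption Peter mentions can be sketched with a toy model of the documented looping pattern, in which every failed attempt costs a fresh subtransaction ID (all names invented; this simulates only the control flow, not real transactions):

```python
import itertools

# Each lap of the retry loop runs in a fresh subtransaction, so each
# failed attempt burns one subtransaction XID.

xid_counter = itertools.count(1)

def looping_upsert(table, key, val, rival_inserts):
    burned = []
    while True:
        burned.append(next(xid_counter))   # BEGIN a subxact: one XID
        if key in table:                   # UPDATE path succeeds
            table[key] = val
            return burned
        if rival_inserts and rival_inserts.pop(0):
            table[key] = "rival"           # concurrent insert wins the race:
            continue                       # our INSERT hits a unique
                                           # violation, the subxact aborts,
                                           # and we loop
        table[key] = val                   # INSERT path succeeds
        return burned

table = {}
# A rival session sneaks an insert in during our first attempt.
burned = looping_upsert(table, 1, "val", [True])
print(len(burned), table[1])               # 2 val  (two XIDs for one upsert)
```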
On 25 September 2014 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
>> My approach would be to insert an index tuple for that value into the
>> index, but with the leaf ituple marked with an xid rather than a ctid.
>> If someone tries to insert into the index they would see this and wait
>> for the inserting transaction to end. The inserting transaction would
>> then resolve what happens in the heap (insert/update) and later
>> repoint the index tuple to the inserted/updated row version. I don't
>> see the need for page level locking since it would definitely result
>> in deadlocks (e.g. SQLServer).
>
> I think that something like this might work, but the devil is in the
> details. Suppose two people try to upsert into the same table at the
> same time. There's one index. If the transactions search that index
> for conflicts first, neither sees any conflicting tuples, and both
> proceed. That's no good. OK, so suppose each transaction inserts the
> special index tuple which you mention, to lock out concurrent inserts
> of that value, and then searches for already-existing conflicts. Each
> sees the other's tuple, and they deadlock. That's no good, either.

The test index is unique, so our to-be-inserted value exists on only one page, hence page locking applies while we insert it. The next person to insert waits for the page lock and then sees the test tuple.

The page lock lasts only for the duration of the insertion of the ituple, not for the whole operation.

> Also, I think there are other cases where we think we're going to
> insert, so we put the special index tuple in there, but then we decide
> to update, so we don't need the promise tuple any more, but other
> sessions are potentially still waiting for our XID to terminate even
> though there's no conflict any more. I'm having a hard time bringing
> the details of those cases to mind ATM, though.

We make the decision to INSERT or UPDATE based upon what we find in the test index. If a value is there already, we assume it's an UPDATE and go to update the row this points to. If it has been deleted, we loop back and try again/error. If the value is not present, we insert the test tuple and progress as an INSERT, then loop back later to set the ctid. There is no case of "don't need the promise tuple any more". We would use the PK, identity or first unique index as the test index. There is a case where an UPSERT conflicts with an INSERT, causing the latter to abort.

Anyway, this is why we need the design more clearly exposed, so you can tell me I'm wrong by showing me the URL of it done right.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 25 September 2014 19:59, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Sep 25, 2014 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've never been a fan of putting the index name in there.
>
> Me neither. Although I do understand Kevin's concern about the user's
> intent surrounding which unique index to merge on.

The use case cited is real. My solution of using an after trigger works, yet without adding specific functionality to this command. So we can achieve what users want without complicating things here. If we do decide we really want it, let's add it as a later patch.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 25 September 2014 20:38, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Sep 25, 2014 at 7:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The way forwards, in my view, is to define precisely the behaviour we
>> wish to have. That definition will include the best current mechanism
>> for running an UPSERT using INSERT/UPDATE/loops and comparing that
>> against what is being provided here. We will then have a functional
>> test of equivalence of the approaches, and a basis for making a
>> performance test that shows that performance is increased without any
>> loss of concurrency.
>
> That sounds very reasonable.

So I promise not to discuss locking until we get the first things done.

My suggested approach to get this committed is...

A. UPDATE/INSERT privilege infrastructure.
Add tests to it, make it separately committable, so we can get that done.
Submit to Oct CF; get that done early.

B. Agree command semantics by producing these things:
* Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
* SQL Reference Documentation (INSERT)
* Test cases for feature
* Test cases for concurrency
* Test cases for pgbench
All of the above, as a separate committable patch. I hate the fact that you have written no user-facing documentation for this feature. How can anyone tell whether the tests you've written are correct, or even consistent with a particular definition of correctness?
Submit as patch for review only to Oct 15 CF.
We then agree what is required for further work.
At this stage, poll the Django and Rails communities for acceptance and early warning of these features. Listen.

C. Internal weirdness
Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch deadline, so we can fine-tune for last CF.
Then Heikki rewrites half your patch in a better way, you thank him and then we commit. All done.
> While I'm sure that what I have here can
> decisively beat the xact looping pattern in terms of performance as
> measured by pgbench, the real performance advantage is that this
> approach doesn't burn through XIDs. That was a concern that Andres
> highlighted in relation to using the subxact looping pattern with
> BDR's multi-master replication conflict resolution.

But we're still discussing SQL semantics. So first things first, then loop back around, hoping our design has not been concurrently deleted...

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 25, 2014 at 1:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> The test index is unique, so our to-be-inserted value exists on only
> one page, hence page locking applies while we insert it. The next
> person to insert waits for the page lock and then sees the test tuple.
>
> The page lock lasts only for the duration of the insertion of the
> ituple, not for the whole operation.

(By "page lock", I take it you mean buffer lock - converting that into a page hwlock is what I do.)

This is where it gets quite complicated. What happens if row locking on upsert finds a conflicting update that changes uniquely-constrained attributes? Sure, a vanilla non-HOT update will fail on inserting a unique index tuple, but *it* can still cause us a row-level conflict, and *it* can only fail (with a dup violation) when we commit/abort. But now we're obligated to wait on it to get the row lock, and it's obligated to wait on us to get the promise tuple lock, or any other sort of "value lock" that hasn't already been released when we go to row lock. Deadlock. You cannot get away with failing to release the promise tuple/value lock if you want to maintain useful guarantees.

It doesn't need to be a vanilla non-HOT update. That's just the simplest example I can think of.

>> Also, I think there are other cases where we think we're going to
>> insert, so we put the special index tuple in there, but then we decide
>> to update, so we don't need the promise tuple any more, but other
>> sessions are potentially still waiting for our XID to terminate even
>> though there's no conflict any more. I'm having a hard time bringing
>> the details of those cases to mind ATM, though.
>
> We make the decision to INSERT or UPDATE based upon what we find in
> the test index. If a value is there already, we assume it's an UPDATE
> and go to update the row this points to. If it has been deleted we
> loop back and try again/error.

Sure, you can throw an error, and that makes things a lot easier. It also implies that the implementation is inferior to the subxact looping pattern, which you've already implied is a thing we must beat in every way. Frankly, I think it's a cop-out to just throw an error, and I don't think it'll end up being some theoretical risk. It'll happen often if it is allowed to happen at all. Allowing it to happen almost defeats the purpose of the feature - the big appeal of the feature is that it makes guarantees about the outcome.

--
Peter Geoghegan
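(A rough sketch of the interleaving described above, as I understand it; the table, columns and key values are hypothetical, and "value lock" stands for whichever implementation - promise tuple or page hwlock - is in use:)

```sql
-- Assume: CREATE TABLE t (k int UNIQUE, v text);
-- and an existing row (2, 'y').
--
-- Session 1 (upserter):
INSERT INTO t VALUES (1, 'x') ON CONFLICT UPDATE SET v = 'x';
--   1. takes a value lock on k = 1 in the unique index
--
-- Session 2 (vanilla non-HOT update changing the unique column):
UPDATE t SET k = 1 WHERE k = 2;
--   2. holds the row lock on the row it is updating, and must insert
--      an index tuple for k = 1, so it blocks on Session 1's value lock
--
--   3. Session 1 sees Session 2's row as a conflict and must wait for
--      its row lock, while still holding the value lock
--
-- Each session now waits on the other: an "unprincipled deadlock",
-- unless the value lock is always released before row locking.
```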
On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> A. UPDATE/INSERT privilege infrastructure.
> Add tests to it, make it separately committable, so we can get that done.
> Submit to Oct CF; get that done early.

Makes sense. As long as we assume that we want a unified syntax like this - that is, that we need something vaguely insert-update or update-insert-ish - then we need this. Unfortunately, we cannot add regression tests for this without almost the full patch set.

> B. Agree command semantics by producing these things
> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
> * SQL Reference Documentation (INSERT)
> * Test cases for feature
> * Test cases for concurrency
> * Test cases for pgbench

Okay. I do have stress tests, which are separately maintained, in case you missed that:

https://github.com/petergeoghegan/upsert

> All of the above, as a separate committable patch. I hate the fact
> that you have written no user-facing documentation for this feature.
> How can anyone tell whether the tests you've written are correct or
> even consistent with a particular definition of correctness?

I'd hoped that the commit messages, and my discussion of the feature, were adequate.

> Submit as patch for review only to Oct 15 CF
> We then agree what is required for further work
> At this stage, poll the Django and Rails communities for acceptance
> and early warning of these features. Listen.

I know an original founder of the Django project quite well - Jacob Kaplan-Moss (a co-worker - the guy that keynoted pgOpen in its second year). He is very interested in this effort.

> C. Internal weirdness
> Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch
> deadline, so we can fine tune for last CF.
> Then Heikki rewrites half your patch in a better way, you thank him
> and then we commit. All done.
I don't have a problem with Heikki or anyone else rewriting the value locking part of the patch, provided it meets my requirements for such a mechanism. Since Heikki already agreed that that standard should be imposed, he'd hardly take issue with it now.

However, the fact is that once you actually make something like promise tuples meet that standard, at the very least it becomes a lot messier than you'd think. Heikki's final prototype "super deleted" tuples by setting their xmin to InvalidTransactionId. We weren't sure that that doesn't break some random other heapam code. Consider this, for example:

https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/executor/execMain.c#L1961

That looks safe in the face of setting xmin to InvalidTransactionId in the way the later prototype patch did, if you think about it for a while, but there are other places where that is less clear. In short, it becomes something that we have to worry about forever, because "xmin cannot change without the tuple in the slot changing" is clearly an invariant for certain purposes. It might accidentally fail to fail right now, but I'm not comfortable with it.

Now, I might be convinced that that's actually the way to go. I have an open mind. But that will take discussion. I like that page hwlocking is something that many systems do (even including Oracle, I believe). Making big changes to nbtree is always something that deserves to be met with skepticism, but it is nice to have an implementation that lives in the head of AM.

Sorry, I forgot to not talk about locking.

> But we're still discussing SQL semantics. So first things first, then
> loop back around, hoping our design has not been concurrently
> deleted...

I hope the discussion can avoid "unprincipled deadlocks"....

--
Peter Geoghegan
On 25 September 2014 22:13, Peter Geoghegan <pg@heroku.com> wrote:
>> All of the above, as a separate committable patch. I hate the fact
>> that you have written no user-facing documentation for this feature.
>> How can anyone tell whether the tests you've written are correct or
>> even consistent with a particular definition of correctness?
>
> I'd hoped that the commit messages, and my discussion of the feature
> were adequate.

I'd hoped that my discussion was sufficient to persuade you too, but it wasn't.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Sep 25, 2014 at 2:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I'd hoped that the commit messages, and my discussion of the feature
>> were adequate.
>
> I'd hoped that my discussion was sufficient to persuade you too, but it wasn't.

I'll write user-visible docs soon, then.

--
Peter Geoghegan
On 26/09/14 08:21, Simon Riggs wrote:
> On 25 September 2014 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
>
>>> My approach would be to insert an index tuple for that value into the
>>> index, but with the leaf ituple marked with an xid rather than a ctid.
>>> If someone tries to insert into the index they would see this and wait
>>> for the inserting transaction to end. The inserting transaction would
>>> then resolve what happens in the heap (insert/update) and later
>>> repoint the index tuple to the inserted/updated row version. I don't
>>> see the need for page level locking since it would definitely result
>>> in deadlocks (e.g. SQLServer).
>>
>> I think that something like this might work, but the devil is in the
>> details. Suppose two people try to upsert into the same table at the
>> same time. There's one index. If the transactions search that index
>> for conflicts first, neither sees any conflicting tuples, and both
>> proceed. That's no good. OK, so suppose each transaction inserts the
>> special index tuple which you mention, to lock out concurrent inserts
>> of that value, and then searches for already-existing conflicts. Each
>> sees the other's tuple, and they deadlock. That's no good, either.
>
> The test index is unique, so our to-be-inserted value exists on only
> one page, hence page locking applies while we insert it. The next
> person to insert waits for the page lock and then sees the test tuple.
>
> The page lock lasts only for the duration of the insertion of the
> ituple, not for the whole operation.
>
>> Also, I think there are other cases where we think we're going to
>> insert, so we put the special index tuple in there, but then we decide
>> to update, so we don't need the promise tuple any more, but other
>> sessions are potentially still waiting for our XID to terminate even
>> though there's no conflict any more. I'm having a hard time bringing
>> the details of those cases to mind ATM, though.
> We make the decision to INSERT or UPDATE based upon what we find in
> the test index. If a value is there already, we assume it's an UPDATE
> and go to update the row this points to. If it has been deleted we
> loop back and try again/error. If the value is not present, we insert
> the test tuple and progress as an INSERT, then loop back later to set
> the ctid. There is no case of "don't need promise tuple any more". We
> would use the PK, identity or first unique index as the test index.
> There is a case where an UPSERT conflicts with an INSERT causing the
> latter to abort.
>
> Anyway, this is why we need the design more clearly exposed, so you
> can tell me I'm wrong by showing me the URL of it done right.

What happens if the new value(s) of the INSERT/UPDATE require the page to be split? I assume the mechanics of this are catered for, but how does it affect locking & potential deadlocks?

Cheers,
Gavin
On 09/26/2014 12:13 AM, Peter Geoghegan wrote:
> On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> C. Internal weirdness
>> Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch
>> deadline, so we can fine tune for last CF.
>> Then Heikki rewrites half your patch in a better way, you thank him
>> and then we commit. All done.
>
> I don't have a problem with Heikki or anyone else rewriting the value
> locking part of the patch, provided it meets my requirements for such
> a mechanism. Since Heikki already agreed that that standard should be
> imposed, he'd hardly take issue with it now.
>
> However, the fact is that once you actually make something like
> promise tuples meet that standard, at the very least it becomes a lot
> messier than you'd think. Heikki's final prototype "super deleted"
> tuples by setting their xmin to InvalidTransactionId. We weren't sure
> that that doesn't break some random other heapam code. Consider this,
> for example:
>
> https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/executor/execMain.c#L1961
>
> So that looks safe in the face of setting xmin to InvalidTransactionId
> in the way the later prototype patch did if you think about it for a
> while, but there are other places where that is less clear. In short,
> it becomes something that we have to worry about for ever, because
> "xmin cannot change without the tuple in the slot changing" is clearly
> an invariant for certain purposes. It might accidentally fail to fail
> right now, but I'm not comfortable with it.

Just to be clear: I wrote the initial patch to demonstrate what I had in mind, because I was not able to explain it well enough otherwise. You pointed out issues with it, which I then fixed. You then pointed out more issues, which I then fixed again. My patch version was a proof of concept, to demonstrate that it can be done.
What I'd like you to do now, as the patch author, is to take the promise tuple approach and clean it up. If the xmin stuff is ugly, figure out some other way to do it.

> Now, I might be convinced that that's actually the way to go. I have
> an open mind. But that will take discussion. I like that page
> hwlocking is something that many systems do (even including Oracle, I
> believe). Making big changes to nbtree is always something that
> deserves to be met with skepticism, but it is nice to have an
> implementation that lives in the head of AM.

I don't know what you mean by "in the head of AM", but IMO it would be far better if we can implement this outside the index AMs. Then it will work with any index AM.

BTW, in the discussions, you pointed out that exclusion constraints currently behave differently from a unique index, when two backends insert a tuple at the same time. With a unique index, one of them will fail, but one is always guaranteed to succeed. With an exclusion constraint, they can both fail if you're unlucky. I think the promise tuples would allow us to fix that, too, while we're at it. In fact, you might consider tackling that first, and build the new INSERT ON CONFLICT syntax on top of that. Basically, an INSERT to a table with an exclusion constraint would be the same as "INSERT ON CONFLICT throw an error". That would be a useful way to split this patch into two parts.

- Heikki
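(As a concrete illustration of the exclusion constraint behavior described above - a sketch with invented names; note that `room WITH =` on a plain integer column requires the btree_gist extension:)

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- No two rows may share a room with overlapping time ranges.
CREATE TABLE reservation (
    room   int,
    during tsrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);

-- If two backends run inserts like this concurrently with overlapping
-- ranges, both can currently fail with an exclusion violation; with a
-- plain unique index, exactly one of two duplicate inserters is
-- guaranteed to succeed.
INSERT INTO reservation
VALUES (101, '[2014-09-26 10:00, 2014-09-26 11:00)');
```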
On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
> I don't know what you mean by "in the head of AM", but IMO it would be far
> better if we can implement this outside the index AMs. Then it will work
> with any index AM.

Also, it's the only chance to make this ever work across partitions.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09/26/2014 03:30 PM, Andres Freund wrote:
> On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
>> I don't know what you mean by "in the head of AM", but IMO it would be far
>> better if we can implement this outside the index AMs. Then it will work
>> with any index AM.
>
> Also, it's the only chance to make this ever work across partitions.

How so? Assuming there's no overlap in the partitions, you could lock the page in the index of the partition you're inserting to, just like you would insert the promise tuple to the right partition.

- Heikki
On 2014-09-26 15:32:35 +0300, Heikki Linnakangas wrote:
> On 09/26/2014 03:30 PM, Andres Freund wrote:
>> On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
>>> I don't know what you mean by "in the head of AM", but IMO it would be far
>>> better if we can implement this outside the index AMs. Then it will work
>>> with any index AM.
>>
>> Also, it's the only chance to make this ever work across partitions.
>
> How so? Assuming there's no overlap in the partitions, you could lock the
> page in the index of the partition you're inserting to, just like you would
> insert the promise tuple to the right partition.

Well, the 'no overlap' case is boring. At least if you mean that each partition has distinct value ranges in the index? And the reason the buffer locking approach can't work in the overlapping case is that you'd need to hold a large number of pages locked at the same time. Right?

But primarily I mean that the bulk of the uniqueness checking logic has to live outside the individual AMs. It doesn't sound enticing to reach from inside one AM into another partition's index to do stuff.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09/26/2014 03:40 PM, Andres Freund wrote:
> On 2014-09-26 15:32:35 +0300, Heikki Linnakangas wrote:
>> On 09/26/2014 03:30 PM, Andres Freund wrote:
>>> On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
>>>> I don't know what you mean by "in the head of AM", but IMO it would be far
>>>> better if we can implement this outside the index AMs. Then it will work
>>>> with any index AM.
>>>
>>> Also, it's the only chance to make this ever work across partitions.
>>
>> How so? Assuming there's no overlap in the partitions, you could lock the
>> page in the index of the partition you're inserting to, just like you would
>> insert the promise tuple to the right partition.
>
> Well, the 'no overlap' case is boring.

Ok.

> At least if you mean that each partition has distinct value ranges
> in the index?

Right.

> And the reason the buffer locking approach can't work in the overlapping
> case is that you'd need to hold a large number of pages locked at the same
> time. Right?

Yeah, you would. To be honest, I didn't even think about the overlapping case; I just assumed that the non-overlapping case is the typical one and only thought about that.

> But primarily I mean that the bulk of the uniqueness checking logic has to
> live outside the individual AMs. It doesn't sound enticing to reach from
> inside one AM into another partition's index to do stuff.

Yeah, that's a non-starter. Even with the index locking stuff, though, it wouldn't be the AM's responsibility to reach out to other partitions.

- Heikki
On 2014-09-26 15:58:17 +0300, Heikki Linnakangas wrote:
> On 09/26/2014 03:40 PM, Andres Freund wrote:
>> And the reason the buffer locking approach can't work in the overlapping
>> case is that you'd need to hold a large number of pages locked at the same
>> time. Right?
>
> Yeah, you would. To be honest, I didn't even think about the overlapping
> case; I just assumed that the non-overlapping case is the typical one and
> only thought about that.

I think it's actually quite common to want a uniqueness constraint spanning partitions. Consider e.g. partitioning on the username. You might still want to ensure emails are unique.

>> But primarily I mean that the bulk of the uniqueness checking logic has to
>> live outside the individual AMs. It doesn't sound enticing to reach from
>> inside one AM into another partition's index to do stuff.
>
> Yeah, that's a non-starter. Even with the index locking stuff, though, it
> wouldn't be the AM's responsibility to reach out to other partitions.

I'm thinking of the way btree currently does uniqueness checks. Unless you move a large chunk of that out of the AM, you'll have a hard time building anything crossing partitions based on it. At least I can't see how.

Greetings,

Andres Freund

--
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Sep 26, 2014 at 5:24 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Just to be clear: I wrote the initial patch to demonstrate what I had in
> mind, because I was not able to explain it well enough otherwise. You
> pointed out issues with it, which I then fixed. You then pointed out more
> issues, which I then fixed again.
> My patch version was a proof of concept, to demonstrate that it can be done.

Right. It was a rough prototype built to prove a point. It also served to show what I was talking about as regards deadlocks (and how the locks could problematically persist in other ways), which I was previously unable to effectively explain to Andres. So it was a very useful exercise, and I wish we did that kind of thing more frequently. But at the same time, I don't want to hold you to that prototype, or misrepresent that prototype as showing your final position on any technical issue. So please correct me if I do that. I've tried to be careful about that.

> What I'd like you to do now, as the patch author, is to take the promise
> tuple approach and clean it up. If the xmin stuff is ugly, figure out some
> other way to do it.

My concern with the xmin stuff is not that it's ugly; it's that it's potentially dangerous. It isn't at all easy to reason about where bugs might appear - lots of things could interact with it in unpredictable ways. I think we'd have to audit a lot of code, all over the place, just to make sure nowhere had an assumption broken. This is a big issue. You are asking me to find a way to save a design that I don't particularly believe in. That might change, but right now I'm afraid that that's the reality. Whereas, my design is entirely contained in the file nbtinsert.c.

> I don't know what you mean by "in the head of AM", but IMO it would be far
> better if we can implement this outside the index AMs. Then it will work
> with any index AM.

I mean that "value locking" is an abstraction that lives in the head of amcanunique AMs.
That kind of encapsulation has considerable value in reducing the risk of bugs. If what I've done has bugs, there aren't that many places that could expose interactions with other complicated code. There are fewer moving parts. It's a generalization of the existing mechanism for unique index enforcement. Plus, database systems have used heavyweight index locks for this kind of thing since the 1970s. That's how this works everywhere else (SQL Server certainly does this for MERGE [1], and only grabs the page-level lock for a second at lower isolation levels, as in my implementation). I think that that ought to count for something.

I will be frank. Everyone knows that the nbtree locking parts of this are never going to be committed over your objections. It cannot happen. And yet, I persist in proposing that we go that way. I may be stubborn, but I am not so stubborn that I'd jeopardize all the work I've put into this to save one aspect of it that no one really cares about anyway (even I only care about meeting my goals for user-visible behavior [2]). I may actually come up with a better way to make what you outline work; then again, I may not. I have no idea, to be honest. It's pretty clear that I'm going to have a hard time getting your basic approach to value locking accepted without rethinking it a lot, though.

Can you really say that you won't have serious misgivings about something like the "tuple->xmin = InvalidTransactionId" swapping, if I actually formally propose it? That's very invasive to a lot of places. And right now, I have no idea how we could do better. I really only want to get to where we have a design that's acceptable. In all sincerity, I may yet be convinced to go your way. It's possible that I've failed to fully understand your concerns. Is it really just about making INSERT ... ON CONFLICT IGNORE work with exclusion constraints (UPDATE clearly makes little sense)?
> Basically, an INSERT to a table with an exclusion constraint would be the
> same as "INSERT ON CONFLICT throw an error". That would be a useful way to
> split this patch into two parts.

I'll think about it. I don't want to do that until I see a way to make your approach to value locking work in a way that someone is actually going to be comfortable committing. I am looking for one.

By the way, IMO stress testing has a very useful role to play in the development of this feature. I've been doing things like trying to flush out races by running long stress tests with random delays artificially added at key points. I would like to make that part of the testing strategy public and transparent.

[1] http://weblogs.sqlteam.com/mladenp/archive/2007/08/03/60277.aspx
[2] Slide 8, "Goals for UPSERT in Postgres": http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf

--
Peter Geoghegan
On Fri, Sep 26, 2014 at 5:58 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> And the reason the buffer locking approach can't work in the overlapping
>> case is that you'd need to hold a large number of pages locked at the same
>> time. Right?
>
> Yeah, you would. To be honest, I didn't even think about the overlapping
> case; I just assumed that the non-overlapping case is the typical one and
> only thought about that.

I'm not sure that I follow. Unique constraints don't work across partitions today. Why should this work across partitions in the most general case? Simply because there'd have to be one page lock held per unique index/partition, whereas promise tuples are somewhat like row locks, so presumably only one lock table entry is required?

In other database systems with better partitioning support, there is such a thing as indexes that apply across all partitions ("global indexes"). There are also "local indexes", which can only be unique if that comports with the partitioning key in a way that makes sense. But we don't have anything like global indexes, and even in those other systems there are huge caveats around MERGE and its impact on global indexes (they are automatically *marked unusable* by an SQL MERGE command). So I think making what you have in mind here work for current Postgres partitioning is totally unrealistic, unless (at the very least) someone also goes and writes a global index feature, which is obviously an enormous project.

--
Peter Geoghegan
Peter Geoghegan wrote:
> Can you really say that you won't have serious misgivings
> about something like the "tuple->xmin = InvalidTransactionId"
> swapping, if I actually formally propose it? That's very invasive to a
> lot of places. And right now, I have no idea how we could do better.

FWIW there are 28 callers of HeapTupleHeaderGetXmin.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> FWIW there are 28 callers of HeapTupleHeaderGetXmin.

31 by my count, though that difference hardly matters. A lot of those callers are in parts of the code that I don't know well. For example, CheckForSerializableConflictOut().

Don't forget about direct callers of HeapTupleHeaderGetRawXmin(), though. There are plenty of those in tqual.c.

--
Peter Geoghegan
On Fri, Sep 26, 2014 at 3:25 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> FWIW there are 28 callers of HeapTupleHeaderGetXmin.
>
> Don't forget about direct callers of HeapTupleHeaderGetRawXmin(),
> though. There are plenty of those in tqual.c.

Which reminds me: commit 37484ad2 added the opportunistic freezing stuff. To quote the commit message:

"""
Instead of changing the tuple xmin to FrozenTransactionId, the combination of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never set together, is now defined as HEAP_XMIN_FROZEN. A variety of previous proposals to freeze tuples opportunistically before vacuum_freeze_min_age is reached have foundered on the objection that replacing xmin by FrozenTransactionId might hinder debugging efforts when things in this area go awry; this patch is intended to solve that problem by keeping the XID around (but largely ignoring the value to which it is set).
"""

Why wouldn't the same objection (the objection that the earlier opportunistic freezing ideas stalled on) apply to directly setting tuple xmin to InvalidTransactionId?

You get the idea, though: making promise tuples possible to release early (before transaction end) by setting tuple xmin to InvalidTransactionId is certainly hard to get right, and seems dangerous.

--
Peter Geoghegan
On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > I hate the fact > that you have written no user facing documentation for this feature. Attached patch adds a commit to the existing patchset. For the convenience of reviewers, I've uploaded and made publicly accessible a html build of the documentation. This page is of most interest: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html See also: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/transaction-iso.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/ddl-inherit.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createrule.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createtrigger.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/index-unique-checks.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createview.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/postgres-fdw.html -- Peter Geoghegan
On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > At this stage, poll the Django and Rails communities for acceptance > and early warning of these features. Listen. FYI, I have asked for input from the Django developers here: https://groups.google.com/forum/#!topic/django-developers/hdzkoLYVjBY -- Peter Geoghegan
On 27 September 2014 23:23, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I hate the fact >> that you have written no user facing documentation for this feature. > > Attached patch adds a commit to the existing patchset. For the > convenience of reviewers, I've uploaded and made publicly accessible a > html build of the documentation. This page is of most interest: > > http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html My request was for the following... Agree command semantics by producing these things * Explanatory documentation (Ch6.4 Data Manipulation - Upsert) * SQL Reference Documentation (INSERT) * Test cases for feature * Test cases for concurrency * Test cases for pgbench because it forces you to show in detail how the command works. Adding a few paragraphs to the INSERT page with two quick examples is not the same level of detail at all and leaves me with the strong impression my input has been assessed as ON CONFLICT IGNORE. Examples of the following are needed "ON CONFLICT UPDATE optionally accepts a WHERE clause condition. When provided, the statement only proceeds with updating if the condition is satisfied. Otherwise, unlike a conventional UPDATE, the row is still locked for update. Note that the condition is evaluated last, after a conflict has been identified as a candidate to update." Question arising: do you need to specify location criteria, or is this an additional filter? When/why would we want that? "Failure to anticipate and prevent would-be unique violations originating in some other unique index than the single unique index that was anticipated as the sole source of would-be uniqueness violations can result in updating a row other than an existing row with conflicting values (if any)." In English, please How would you do "if colA = 3 then ignore else update"? 
No explanation of why the CONFLICTING() syntax differs from OLD./NEW. syntax used in triggers The page makes no mention of the upsert problem, nor is any previous code mentioned. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > My request was for the following... > > Agree command semantics by producing these things > * Explanatory documentation (Ch6.4 Data Manipulation - Upsert) Do you really think I could get an entire chapter out of this? > * SQL Reference Documentation (INSERT) > * Test cases for feature > * Test cases for concurrency All of these were added. There are two new sets of isolation tests, one per variant of the new clause (IGNORE/UPDATE). > * Test cases for pgbench They're not part of the patch proper, but as I've already mentioned I have pgbench based stress-tests on Github. There is a variety of test-cases that test the feature under high concurrency: https://github.com/petergeoghegan/upsert > Examples of the following are needed > > "ON CONFLICT UPDATE optionally accepts a WHERE clause condition. Yes, I realized I missed an example of that one the second I hit "send". The MVCC interactions of this are discussed within transaction-iso.html, FWIW. > Question arising: do you need to specify location criteria, or is this > an additional filter? When/why would we want that? It is an additional way to specify a predicate/condition to UPDATE on. There might be a kind of redundancy, if you decided to repeat the constrained values in the predicate too, but if you're using the WHERE clause sensibly there shouldn't be. So your UPDATE's "full predicate" is sort of the union of the constrained values that the conflict path was taken for, plus whatever you put in the WHERE clause, but not quite because they're evaluated at different times (as explained within transaction-iso.html). > How would you do "if colA = 3 then ignore else update"? Technically, you can't do that exact thing. IGNORE is just for quickly dealing with ETL-type problems (and it is reasonable to use it without one particular unique index in mind, unlike ON CONFLICT UPDATE) - think pgloader. 
But if you did this: INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB = CONFLICTING(colB) WHERE colA != 3 Then you would achieve almost the same thing. You wouldn't have inserted or updated anything if the only rows considered had a colA of 3, but any such rows considered would be locked, which isn't the same as IGNOREing them. > No explanation of why the CONFLICTING() syntax differs from OLD./NEW. > syntax used in triggers Why should it be the same? > The page makes no mention of the upsert problem, nor is any previous > code mentioned. What's the upsert problem? I mean, apart from the fact that we don't have it. Note that it is documented that one of the two outcomes is guaranteed. I should have updated the plpgsql looping subxact example, though. -- Peter Geoghegan
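The conditional-update semantics of the `WHERE colA != 3` example above can be approximated with SQLite's upsert syntax (added much later, in SQLite 3.24) via Python's sqlite3 module. This is only a sketch: the proposed Postgres syntax, the CONFLICTING() construct, and the row-locking behaviour all differ, and the table and column names are the hypothetical ones from the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (colA INTEGER PRIMARY KEY, colB TEXT)")
conn.execute("INSERT INTO tab VALUES (3, 'keep'), (4, 'old')")

# Conflict on colA = 3: the WHERE clause is not satisfied, so the
# existing row is left untouched (in SQLite nothing happens at all;
# under the proposed Postgres semantics the row would still be locked).
conn.execute("""
    INSERT INTO tab(colA, colB) VALUES (3, 'foo')
    ON CONFLICT(colA) DO UPDATE SET colB = excluded.colB
    WHERE tab.colA != 3
""")

# Conflict on colA = 4: the WHERE clause passes, so the update proceeds.
conn.execute("""
    INSERT INTO tab(colA, colB) VALUES (4, 'foo')
    ON CONFLICT(colA) DO UPDATE SET colB = excluded.colB
    WHERE tab.colA != 3
""")

print(conn.execute("SELECT colA, colB FROM tab ORDER BY colA").fetchall())
# -> [(3, 'keep'), (4, 'foo')]
```

Note SQLite's `excluded.colB` plays the role the proposed `CONFLICTING(colB)` plays in the example being discussed.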
On 09/28/2014 09:40 AM, Peter Geoghegan wrote: >> No explanation of why the CONFLICTING() syntax differs from OLD./NEW. >> syntax used in triggers > > Why should it be the same? Both can be seen as cases where you refer to a field of a tuple, which is usually done with FOO.bar. -- Andreas Karlsson
On 09/28/2014 03:40 PM, Peter Geoghegan wrote: > Do you really think I could get an entire chapter out of this? Yes. It might be a short chapter, but once you extract the existing upsert example from the docs and show why the naïve approach doesn't work, there'll be enough to go on. People get this wrong a *lot*. http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/ http://stackoverflow.com/q/17267417/398670 http://stackoverflow.com/q/1109061/398670 I'm happy to help with documenting it. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
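The looping pattern referred to above (the documented plpgsql subxact example) looks roughly like this when transliterated to client-side code. This is a sketch only, using SQLite as a stand-in for Postgres; it shows why ad-hoc upsert needs a retry loop at all, and it is still not safe against every interleaving in a real concurrent setting:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upsert (key INTEGER PRIMARY KEY, val TEXT)")


def naive_upsert(conn, key, val, max_retries=3):
    # The classic "loop until one path wins" pattern: try UPDATE first;
    # if no row matched, try INSERT; if a concurrent session inserted
    # the key in between, the INSERT raises a unique violation and we
    # must retry the UPDATE.
    for _ in range(max_retries):
        cur = conn.execute("UPDATE upsert SET val = ? WHERE key = ?",
                           (val, key))
        if cur.rowcount > 0:
            return "updated"
        try:
            conn.execute("INSERT INTO upsert(key, val) VALUES (?, ?)",
                         (key, val))
            return "inserted"
        except sqlite3.IntegrityError:
            continue  # lost the race: someone else inserted the key
    raise RuntimeError("upsert did not converge")


print(naive_upsert(conn, 1, "insert"))  # inserted
print(naive_upsert(conn, 1, "update"))  # updated
```

The whole point of the patch under discussion is to make this loop (and its subtle failure modes) unnecessary.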
On 28 September 2014 08:40, Peter Geoghegan <pg@heroku.com> wrote: > On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> My request was for the following... >> >> Agree command semantics by producing these things >> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert) ... > INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB = > CONFLICTING(colB) WHERE colA != 3 > > Then you would achieve almost the same thing. You wouldn't have > inserted or updated anything if the only rows considered had a colA of > 3, but any such rows considered would be locked, which isn't the same > as IGNOREing them. > >> No explanation of why the CONFLICTING() syntax differs from OLD./NEW. >> syntax used in triggers > > Why should it be the same? Good question. What could be wrong with making up new syntax? The obvious answer is because we would simply have nothing to guide us. No principles that can be applied, just opinions. My considered opinion is that the above syntax is * non-standard * inconsistent with what we have elsewhere * an additional item for implementors to handle I could use more emotive words here, but the above should suffice to cover my unease at inventing new SQL constructs. This is Postgres. What worries me the most is that ORM implementors everywhere will simply ignore our efforts, leaving us with something we'd much rather we didn't have. As a possible committer of this feature, I would not wish to put my name to that. You will need a committer who will do that. Which brings me back to the SQL Standard, which is MERGE. We already know the MERGE command does not fully and usefully define its concurrent behaviour; I raised this 6 years ago. It's not clear to me that we couldn't more closely define the behaviour for a subset of the command. If we implement MERGE, then we will help ORM developers do less work to support Postgres, which will encourage adoption. 
My proposal would be to implement only a very limited syntax for MERGE in this release, replacing this > INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB = > CONFLICTING(colB) WHERE colA != 3 with this... MERGE INTO tab USING VALUES ('foo') WHEN NOT MATCHED THEN INSERT (colB) WHEN MATCHED THEN UPDATE SET colB = NEW.p1 and throwing "ERROR: full syntax for MERGE not implemented yet" if people stretch too far. If there is some deviation from the standard, it can be explained clearly, though I don't see we would need to do that - we can extend beyond the standard to explain the concurrent behaviour. And we will be a lot closer to getting full MERGE also. Doing MERGE syntax is probably about 2 weeks work, which is better than 2 weeks per ORM to support the new Postgres-only syntax. Thanks for your efforts to bring this to a conclusion. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > MERGE INTO tab USING VALUES ('foo') > WHEN NOT MATCHED THEN > INSERT (colB) > WHEN MATCHED THEN > UPDATE SET colB = NEW.p1 > > and throwing "ERROR: full syntax for MERGE not implemented yet" if > people stretch too far. That isn't the MERGE syntax either. Where is the join? I've extensively discussed why I think we should avoid calling something upsert-like MERGE, as you know: http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com We *should* have a MERGE feature, but one that serves the actual MERGE use-case well. That is an important use-case; it just isn't the one I'm interested in right now. FWIW, I agree that it wouldn't be much work to do this - what you present here really is just a different syntax for what I have here (which isn't MERGE). I think it would be counter-productive to pursue this, though. Also, what about limiting the unique indexes under consideration? There was informal meeting of this at the dev meeting a in 2012. -- Peter Geoghegan
On Sun, Sep 28, 2014 at 1:31 PM, Peter Geoghegan <pg@heroku.com> wrote: > There was informal meeting of this at the dev meeting a in 2012. I mean: There was informal agreement that as long as we're working on a feature that makes useful, UPSERT-like guarantees, we shouldn't use the MERGE syntax. MERGE clearly benefits (in ways only relevant to the use-case it targets) from having the leeway to not care about what someone with the UPSERT use-case would call race conditions. -- Peter Geoghegan
On 29/09/14 09:31, Peter Geoghegan wrote: > On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> MERGE INTO tab USING VALUES ('foo') >> WHEN NOT MATCHED THEN >> INSERT (colB) >> WHEN MATCHED THEN >> UPDATE SET colB = NEW.p1 >> >> and throwing "ERROR: full syntax for MERGE not implemented yet" if >> people stretch too far. > That isn't the MERGE syntax either. Where is the join? > > I've extensively discussed why I think we should avoid calling > something upsert-like MERGE, as you know: > http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com > > We *should* have a MERGE feature, but one that serves the actual MERGE > use-case well. That is an important use-case; it just isn't the one > I'm interested in right now. > > FWIW, I agree that it wouldn't be much work to do this - what you > present here really is just a different syntax for what I have here > (which isn't MERGE). I think it would be counter-productive to pursue > this, though. Also, what about limiting the unique indexes under > consideration? > > There was informal meeting of this at the dev meeting a in 2012. > How about have a stub page for MERGE, saying it is not implemented yet, but how about considering UPSERT - or something of that nature? I can suspect that people are much more likely to look for 'MERGE' in an index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'. Cheers, Gavin
On Sun, Sep 28, 2014 at 3:41 PM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: > How about have a stub page for MERGE, saying it is not implemented yet, but > how about considering UPSERT - or something of that nature? > > I can suspect that people are much more likely to look for 'MERGE' in an > index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'. Seems reasonable. What I have a problem with is using the MERGE syntax to match people's preexisting confused ideas about what MERGE does. If we do that, it'll definitely bite us when we go to make what we'd be calling MERGE do what MERGE is actually supposed to do. I favor clearly explaining that. -- Peter Geoghegan
On 29/09/14 11:57, Peter Geoghegan wrote: > On Sun, Sep 28, 2014 at 3:41 PM, Gavin Flower > <GavinFlower@archidevsys.co.nz> wrote: >> How about have a stub page for MERGE, saying it is not implemented yet, but >> how about considering UPSERT - or something of that nature? >> >> I can suspect that people are much more likely to look for 'MERGE' in an >> index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'. > Seems reasonable. > > What I have a problem with is using the MERGE syntax to match people's > preexisting confused ideas about what MERGE does. If we do that, it'll > definitely bite us when we go to make what we'd be calling MERGE do > what MERGE is actually supposed to do. I favor clearly explaining > that. > Opinionated I may be, but I wanted stay well clear of the syntax minefield in this area - as I still have at least a vestigial instinct for self preservation! :-)
On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote: >> What I have a problem with is using the MERGE syntax to match people's >> preexisting confused ideas about what MERGE does. If we do that, it'll >> definitely bite us when we go to make what we'd be calling MERGE do >> what MERGE is actually supposed to do. I favor clearly explaining >> that. >> > Opinionated I may be, but I wanted stay well clear of the syntax minefield > in this area - as I still have at least a vestigial instinct for self > preservation! :-) To be clear: I don't think Simon is confused about this at all, which is why I'm surprised that he suggested it. -- Peter Geoghegan
On 29/09/14 14:20, Peter Geoghegan wrote: > On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower > <GavinFlower@archidevsys.co.nz> wrote: >>> What I have a problem with is using the MERGE syntax to match people's >>> preexisting confused ideas about what MERGE does. If we do that, it'll >>> definitely bite us when we go to make what we'd be calling MERGE do >>> what MERGE is actually supposed to do. I favor clearly explaining >>> that. >>> >> Opinionated I may be, but I wanted stay well clear of the syntax minefield >> in this area - as I still have at least a vestigial instinct for self >> preservation! :-) > To be clear: I don't think Simon is confused about this at all, which > is why I'm surprised that he suggested it. > > More specifically, I have only lightly read this thread - and while I think the functionality is useful, I have not thought about it in any real depth. I was thinking more along the lines that if I needed functionality like this, where & how might I look for it. I was remembering my problems looking up syntax in COBOL after coming from FORTRAN (& other languages) - some concepts had different names and the philosophy was significantly different in places. The relevance here is that people's background in other DBMS & knowledge of SQL standards affect what they expect, as well as preventing unnecessary conflicts between PostgreSQL & SQL standards (as far as is practicable & sensible).
On 09/29/2014 06:41 AM, Gavin Flower wrote: > > I can suspect that people are much more likely to look for 'MERGE' in an > index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'. and/or to be looking for MySQL's: ON DUPLICATE KEY {IGNORE|UPDATE} What astonishes me when I look around at how other RDBMS users solve this is how many of them completely ignore concurrency issues. e.g. in this SO question: http://stackoverflow.com/q/108403/398670 there's an alarming lack of concern for concurrency, just a couple of links to : http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/ (BTW, that article contains some useful information about corner cases any upsert approach should test and deal with). Similar with Oracle: Alarming lack of concern for concurrency among users: http://stackoverflow.com/q/237327/398670 Useful article: http://michaeljswart.com/2011/09/mythbusting-concurrent-updateinsert-solutions/ -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Sep 28, 2014 at 8:53 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > there's an alarming lack of concern for concurrency, just a couple of > links to : > > http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/ > > (BTW, that article contains some useful information about corner cases > any upsert approach should test and deal with). Did you find some of those links from my pgCon slides, or independently? I'm well aware of those issues, FWIW. Avoiding repeating the mistakes of others is something that I thought about from an early stage. -- Peter Geoghegan
On 09/29/2014 12:03 PM, Peter Geoghegan wrote: > On Sun, Sep 28, 2014 at 8:53 PM, Craig Ringer <craig@2ndquadrant.com> wrote: >> there's an alarming lack of concern for concurrency, just a couple of >> links to : >> >> http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/ >> >> (BTW, that article contains some useful information about corner cases >> any upsert approach should test and deal with). > > Did you find some of those links from my pgCon slides, or > independently? I'm well aware of those issues, FWIW. Avoiding > repeating the mistakes of others is something that I thought about > from an early stage. Independently. I'm very glad to see you've looked over those issues. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/28/2014 11:31 PM, Peter Geoghegan wrote: > On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> MERGE INTO tab USING VALUES ('foo') >> WHEN NOT MATCHED THEN >> INSERT (colB) >> WHEN MATCHED THEN >> UPDATE SET colB = NEW.p1 >> >> and throwing "ERROR: full syntax for MERGE not implemented yet" if >> people stretch too far. > > That isn't the MERGE syntax either. Where is the join? > > I've extensively discussed why I think we should avoid calling > something upsert-like MERGE, as you know: > http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com > > We *should* have a MERGE feature, but one that serves the actual MERGE > use-case well. That is an important use-case; it just isn't the one > I'm interested in right now. I agree we should not use the MERGE keyword for this. The upsert feature has tighter concurrency requirements than the SQL MERGE command, and that might come back to bite us. It would be highly confusing if some variants of MERGE are concurrency-safe and others are not, but if we now promise that our MERGE command is always concurrency-safe, that promise might be difficult to keep for the full MERGE syntax, and for whatever extensions the SQL committee comes up in the future. That said, it would be handy if the syntax was closer to MERGE. Aside from the concurrency issues, it does the same thing, right? So how about making the syntax identical to MERGE, except for swapping the MERGE keyword with e.g. UPSERT? - Heikki
On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote: > That said, it would be handy if the syntax was closer to MERGE. Aside from > the concurrency issues, it does the same thing, right? So how about making > the syntax identical to MERGE, except for swapping the MERGE keyword with > e.g. UPSERT? I don't think that's a good idea. What most people are missing is an *easy* way to do upsert, that's similar to the normal INSERT. Not something with a pretty different syntax. That's why INSERT OR REPLACE and stuff like that was well adopted. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
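A concrete illustration of the INSERT OR REPLACE family mentioned above, and of why it is not quite an upsert: in SQLite, OR REPLACE is delete-then-insert, so any column not listed in the INSERT is clobbered rather than preserved. A standalone sketch (the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a1', 'b1')")

# INSERT OR REPLACE deletes the conflicting row and inserts a fresh one,
# so column b -- not mentioned in the statement -- becomes NULL.
conn.execute("INSERT OR REPLACE INTO t(k, a) VALUES (1, 'a2')")

print(conn.execute("SELECT a, b FROM t WHERE k = 1").fetchone())
# -> ('a2', None)
```

An ON CONFLICT UPDATE that only sets `a` would instead have preserved `b`, which is part of why the easy-to-type syntax and the correct semantics are distinct design questions.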
On 28 September 2014 21:31, Peter Geoghegan <pg@heroku.com> wrote: > On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> MERGE INTO tab USING VALUES ('foo') >> WHEN NOT MATCHED THEN >> INSERT (colB) >> WHEN MATCHED THEN >> UPDATE SET colB = NEW.p1 >> >> and throwing "ERROR: full syntax for MERGE not implemented yet" if >> people stretch too far. > I've extensively discussed why I think we should avoid calling > something upsert-like MERGE, as you know: > http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com > > We *should* have a MERGE feature, but one that serves the actual MERGE > use-case well. That is an important use-case; it just isn't the one > I'm interested in right now. > > FWIW, I agree that it wouldn't be much work to do this - what you > present here really is just a different syntax for what I have here > (which isn't MERGE). I think it would be counter-productive to pursue > this, though. Also, what about limiting the unique indexes under > consideration? > > There was informal meeting of this at the dev meeting a in 2012. I agreed with the initial proposition to go for a different syntax. Now that I see the new syntax, I have changed my mind. The new syntax is much worse, I am sorry to say. MERGE standard does not offer guidance on concurrent effects, but there is no confusion as to how it works. We can impose our own concurrency rules since those are not covered by the standard. These are quite clear for single row inputs anyway, i.e. a VALUES clause. > > That isn't the MERGE syntax either. Where is the join? > There doesn't need to be one. INSERT assumes that if a column list is not mentioned then the VALUES clause is joined directly to the table, so we can do the same thing for MERGE. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 29 September 2014 02:20, Peter Geoghegan <pg@heroku.com> wrote: > On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower > <GavinFlower@archidevsys.co.nz> wrote: >>> What I have a problem with is using the MERGE syntax to match people's >>> preexisting confused ideas about what MERGE does. If we do that, it'll >>> definitely bite us when we go to make what we'd be calling MERGE do >>> what MERGE is actually supposed to do. I favor clearly explaining >>> that. >>> >> Opinionated I may be, but I wanted stay well clear of the syntax minefield >> in this area - as I still have at least a vestigial instinct for self >> preservation! :-) > > To be clear: I don't think Simon is confused about this at all, which > is why I'm surprised that he suggested it. At this point, I started to discuss MERGE again, but let me stop because there is a wider issue. These threads are littered with references that go nowhere. Links back to an email where you said the same thing two years ago are not proof that it's a bad idea. You need to carefully explain things in detail in one place to allow people to make up their own minds, not just re-assert it endlessly and claim 3 friends also agree, while everyone else searches desperately for what the actual reasons are. Lists of problems with the MERGE statement, with examples, are what is needed to convince and keep us convinced. Then full documentation on the proposed solution, so we can see that also. Please go to some trouble to tidy things up so we have clarity that *we* can see and decide for ourselves whether or not you are correct. Thanks -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 29 September 2014 08:02, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote: >> That said, it would be handy if the syntax was closer to MERGE. Aside from >> the concurrency issues, it does the same thing, right? So how about making >> the syntax identical to MERGE, except for swapping the MERGE keyword with >> e.g. UPSERT? > > I don't think that's a good idea. What most people are missing is an > *easy* way to do upsert, that's similar to the normal INSERT. Not > something with a pretty different syntax. That's why INSERT OR REPLACE > and stuff like that was well adopted. We have 3 choices... 1. SQL Standard MERGE (or a subset) 2. MySQL Compatible syntax 3. Something completely different If we go for (3), I would like to see a long and detailed explanation of what is wrong with (1) and (2) before we do (3). That needs to be clear, detailed, well researched, correct and agreed. Otherwise when we release such a feature, people will ask, why did you do that? And yet nobody will remember. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/29/2014 05:10 PM, Simon Riggs wrote: > > Please go to some trouble to tidy things up so we have clarity that > *we* can see and decide for ourselves whether or not you are correct. Are you suggesting a wiki page to document the issues, discussions around each issue, etc? A summary mail? Something else? We have https://wiki.postgresql.org/wiki/SQL_MERGE but it's outdated, pretty sparse, and not really about the current work. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 29 September 2014 10:27, Craig Ringer <craig@2ndquadrant.com> wrote: > On 09/29/2014 05:10 PM, Simon Riggs wrote: >> >> Please go to some trouble to tidy things up so we have clarity that >> *we* can see and decide for ourselves whether or not you are correct. > > Are you suggesting a wiki page to document the issues, discussions > around each issue, etc? A summary mail? Something else? Something that can be edited to keep it up to date, yes. > We have https://wiki.postgresql.org/wiki/SQL_MERGE but it's outdated, > pretty sparse, and not really about the current work. I rest my case. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 28 September 2014 08:40, Peter Geoghegan <pg@heroku.com> wrote: > On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> My request was for the following... >> >> Agree command semantics by producing these things >> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert) > > Do you really think I could get an entire chapter out of this? If you were an ORM developer reading the PostgreSQL Release Notes for 9.5, which URL would you visit to see a complete description of the new feature, including how it works concurrently, locking and other aspects. How would you check whether some strange behaviour was a bug, or intentional? The new docs are scattered across many pages and there are very few examples. It was very difficult to read like that. >> * SQL Reference Documentation (INSERT) >> * Test cases for feature >> * Test cases for concurrency > > All of these were added. There are two new sets of isolation tests, > one per variant of the new clause (IGNORE/UPDATE). When you say "added", what do you mean? You posted one new doc patch, with no tests in it. >> Question arising: do you need to specify location criteria, or is this >> an additional filter? When/why would we want that? > > It is an additional way to specify a predicate/condition to UPDATE on. > There might be a kind of redundancy, if you decided to repeat the > constrained values in the predicate too, but if you're using the WHERE > clause sensibly there shouldn't be. So your UPDATE's "full predicate" > is sort of the union of the constrained values that the conflict path > was taken for, plus whatever you put in the WHERE clause, but not > quite because they're evaluated at different times (as explained > within transaction-iso.html). I think we should leave that out of the first commit. I'm not sure why that exists. If you wish to push down that route, then I recommend using the MERGE syntax because it caters for this much better than this. 
>> How would you do "if colA = 3 then ignore else update"? > > Technically, you can't do that exact thing. IGNORE is just for quickly > dealing with ETL-type problems (and it is reasonable to use it without > one particular unique index in mind, unlike ON CONFLICT UPDATE) - > think pgloader. But if you did this: > > INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB = > CONFLICTING(colB) WHERE colA != 3 > > Then you would achieve almost the same thing. You wouldn't have > inserted or updated anything if the only rows considered had a colA of > 3, but any such rows considered would be locked, which isn't the same > as IGNOREing them. > >> No explanation of why the CONFLICTING() syntax differs from OLD./NEW. >> syntax used in triggers > > Why should it be the same? Because it would be a principled approach to do that. If we aren't going to use MERGE syntax, it would make sense to at least use the same terminology. e.g. INSERT .... WHEN MATCHED UPDATE The concept of "matched" is identical between MERGE and UPSERT and it will be confusing to have two words for the same thing. There seems to be a good reason not to use the MySQL syntax of ON DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE operations other than a replace, so no deltas, e.g. SET a = a + x Having said that, it would be much nicer to have a mode that allows you to just say the word "UPDATE" and have it copy the data into the correct columns, like MySQL does. That is very intuitive, even if it isn't very flexible. >> The page makes no mention of the upsert problem, nor is any previous >> code mentioned. > > What's the upsert problem? I mean, apart from the fact that we don't > have it. Note that it is documented that one of the two outcomes is > guaranteed. > > I should have updated the plpgsql looping subxact example, though. That's what I meant. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
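The "delta" form mentioned above (SET a = a + x) is the classic counter-upsert. A sketch of what that looks like in practice, using SQLite's upsert syntax as a stand-in (the MySQL ON DUPLICATE KEY UPDATE form differs, and the table and column names here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (k TEXT PRIMARY KEY, n INTEGER)")


def bump(conn, key, by):
    # Delta update: on conflict, add to the existing value rather than
    # replacing it -- precisely the case a bare "replace the row" mode
    # cannot express.
    conn.execute("""
        INSERT INTO counters(k, n) VALUES (?, ?)
        ON CONFLICT(k) DO UPDATE SET n = n + excluded.n
    """, (key, by))


bump(conn, "hits", 1)   # first call inserts (hits, 1)
bump(conn, "hits", 4)   # second call updates to 1 + 4
print(conn.execute("SELECT n FROM counters WHERE k = 'hits'").fetchone()[0])
# -> 5
```

Any syntax that only supports copying the proposed row's values (rather than arbitrary UPDATE expressions) rules this pattern out, which is the flexibility point being argued here.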
On Fri, Sep 26, 2014 at 5:40 PM, Peter Geoghegan <pg@heroku.com> wrote: > I will be frank. Everyone knows that the nbtree locking parts of this > are never going to be committed over your objections. It cannot > happen. And yet, I persist in proposing that we go that way. I may be > stubborn, but I am not so stubborn that I'd jeopardize all the work > I've put into this to save one aspect of it that no one really cares > about anyway (even I only care about meeting my goals for user visible > behavior [2]). I may actually come up with a better way to make what > you outline work; then again, I may not. I have no idea, to be honest. > It's pretty clear that I'm going to have a hard time getting your > basic approach to value locking accepted without rethinking it a lot, > though. Can you really say that you won't have serious misgivings > about something like the "tuple->xmin = InvalidTransactionId" > swapping, if I actually formally propose it? That's very invasive to a > lot of places. And right now, I have no idea how we could do better. > > I really only want to get to where we have a design that's acceptable. > In all sincerity, I may yet be convinced to go your way. It's possible > that I've failed to fully understand your concerns. Is it really just > about making INSERT ... ON CONFLICT IGNORE work with exclusion > constraints (UPDATE clearly makes little sense)? I'll be frank, too. Heikki doesn't need to persuade you to go his way, because everyone other than yourself who has looked at this problem has come up with a design that looks like his. That includes, but is not limited to, every committer who has looked at this. The burden of proof is on you to convince everyone else that the promise tuple approach is wrong, not on everyone else to convince you that it's right. This is a community, and it operates by consensus. Your opinion, no matter how strongly held in the face of opposition, is not a consensus. 
As far as finding an option that's better than clearing the xmin, the point is not that we'd commit that design. Well, we might, if somebody does a careful audit of all the relevant code paths and makes a convincing argument that it's safe. But more likely, somebody will go find some other bit space that can be used to do this. The fact that it's not immediately obvious to you (or Heikki) where to find that bit-space is not a principled argument for changing the whole design. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Sep 29, 2014 at 8:31 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I'll be frank, too. Heikki doesn't need to persuade you to go his > way, because everyone other than yourself who has looked at this > problem has come up with a design that looks like his. Andres suggested something that is very roughly comparable, perhaps. And that was it, really, except for your suggestion that I convinced you wasn't the best way forward (for unrelated reasons). > As far as finding an option that's better than clearing the xmin, the > point is not that we'd commit that design. Well, we might, if > somebody does a careful audit of all the relevant code paths and makes > a convincing argument that it's safe. But more likely, somebody will > go find some other bit space that can be used to do this. The fact > that it's not immediately obvious to you (or Heikki) where to find > that bit-space is not a principled argument for changing the whole > design. I never said that it was. *Obviously* I know that Heikki is not obligated to convince me of anything - I said as much. Whether or not Heikki is obligated to convince me is not the point, which is that it would be nice if he could convince me. I think that there are some serious issues with the promise tuples approach, and discussing those brings us closer to moving forward. -- Peter Geoghegan
On Mon, Sep 29, 2014 at 10:37 AM, Peter Geoghegan <pg@heroku.com> wrote:
>> But more likely, somebody will
>> go find some other bit space that can be used to do this.

My concerns have nothing to do with the availability of bit space, obviously.

--
Peter Geoghegan
On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > If you were an ORM developer reading the PostgreSQL Release Notes for > 9.5, which URL would you visit to see a complete description of the > new feature, including how it works concurrently, locking and other > aspects. How would you check whether some strange behaviour was a bug, > or intentional? We don't do that with UPDATE, so why would we do it with this? There is an existing structure to the documentation that needs to be respected. This is the case even though the EvalPlanQual() mechanism is a total Postgres-ism, which can potentially violate snapshot isolation (this is not true of Oracle's READ COMMITTED, for example). You have to go out of your way to find that out at the moment. But I know ORM authors, and the majority probably don't understand this stuff - that ought to be okay. >> All of these were added. There are two new sets of isolation tests, >> one per variant of the new clause (IGNORE/UPDATE). > > When you say "added", what do you mean? You posted one new doc patch, > with no tests in it. I mean that there was a commit (not included with the documentation, but with the original patchset) with many tests. I don't know why you're suggesting that I don't have "concurrency tests". There are isolation tests in that commit. There are also many regression tests. >> It is an additional way to specify a predicate/condition to UPDATE on. >> There might be a kind of redundancy, if you decided to repeat the >> constrained values in the predicate too, but if you're using the WHERE >> clause sensibly there shouldn't be. So your UPDATE's "full predicate" >> is sort of the union of the constrained values that the conflict path >> was taken for, plus whatever you put in the WHERE clause, but not >> quite because they're evaluated at different times (as explained >> within transaction-iso.html). > > I think we should leave that out of the first commit. I'm not sure why > that exists. 
> If you wish to push down that route, then I recommend
> using the MERGE syntax because it caters for this much better than
> this.

Why leave it out? People are going to "push the predicate into the targetlist" if I do, and the effect is exactly the same.

>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>>> syntax used in triggers
>>
>> Why should it be the same?
>
> Because it would be a principled approach to do that.

That is just an assertion. The MERGE syntax doesn't use that either.

> If we aren't going to use MERGE syntax, it would make sense to at
> least use the same terminology.
>
> e.g.
> INSERT ....
> WHEN MATCHED
> UPDATE
>
> The concept of "matched" is identical between MERGE and UPSERT and it
> will be confusing to have two words for the same thing.

I don't care if we change the spelling to "WHEN MATCHED UPDATE/IGNORE". That seems fine. But MERGE is talking about a join, not the presence of a would-be duplicate violation.

> There seems to be a good reason not to use the MySQL syntax of ON
> DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE
> operations other than a replace, so no deltas, e.g. SET a = a + x

That isn't true, actually. It clearly does.

> Having said that, it would be much nicer to have a mode that allows
> you to just say the word "UPDATE" and have it copy the data into the
> correct columns, like MySQL does. That is very intuitive, even if it
> isn't very flexible.

Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW.

--
Peter Geoghegan
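For the record, the delta-style assignment at issue here is expressible in MySQL roughly as follows (table and column names are illustrative):

```sql
-- MySQL: VALUES(hits) refers to the value proposed for insertion,
-- so delta updates like hits + ... are possible
INSERT INTO counters (id, hits) VALUES (1, 10)
ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits);
```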
On Mon, Sep 29, 2014 at 12:02 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote: >> That said, it would be handy if the syntax was closer to MERGE. Aside from >> the concurrency issues, it does the same thing, right? So how about making >> the syntax identical to MERGE, except for swapping the MERGE keyword with >> e.g. UPSERT? > > I don't think that's a good idea. What most people are missing is an > *easy* way to do upsert, that's similar to the normal INSERT. Not > something with a pretty different syntax. That's why INSERT OR REPLACE > and stuff like that was well adopted. Agreed. MERGE isn't the same other than the concurrency concerns, in any case. It is driven by a join, which is very flexible, but also has problems with concurrency (leaving aside the fact that in practice it doesn't tend to work out well when it isn't an equi-join). UPSERT *has* to be driven by something like a would-be unique violation, not an outer join matching or not matching. -- Peter Geoghegan
On Mon, Sep 29, 2014 at 2:27 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> Please go to some trouble to tidy things up so we have clarity that >> *we* can see and decide for ourselves whether or not you are correct. > > Are you suggesting a wiki page to document the issues, discussions > around each issue, etc? A summary mail? Something else? It isn't easy, Simon. I thought my big e-mail at the start of the thread was a summary. -- Peter Geoghegan
On Mon, Sep 29, 2014 at 2:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> 1. SQL Standard MERGE (or a subset)
> 2. MySQL Compatible syntax
> 3. Something completely different
>
> If we go for (3), I would like to see a long and detailed explanation
> of what is wrong with (1) and (2) before we do (3). That needs to be
> clear, detailed, well researched, correct and agreed. Otherwise when
> we release such a feature, people will ask, why did you do that? And
> yet nobody will remember.

My syntax is inspired by the MySQL one, with some influence from SQLite (SQLite has an ON CONFLICT REPLACE). I don't want to copy MySQL's use of VALUES() in the UPDATE targetlist - I spell the same concept as CONFLICTING(). I guess that otherwise they'd have to make the VALUES()/CONFLICTING() expression a whole new fully reserved keyword, and they preferred not to. Also, MySQL bizarrely omits the "SET" keyword within ON DUPLICATE KEY UPDATE. So I haven't copied it exactly, on aesthetic grounds.

I think that the actual reason for the latter wart (the SET omission) is that MySQL found it easier to write the grammar that way. Consider what we do here to make SET in an UPDATE work, despite the fact that it's a valid column name:

https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/parser/gram.y#L10141

So I wanted to suggest something similar but not identical to the MySQL syntax, with a bit more flexibility/safety. I thought that I could do so without emulating their warts. As I've mentioned, it isn't the MERGE syntax because that is quite a different thing. There is a place for it, but it's not strategically important in the same way as upsert is.

--
Peter Geoghegan
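Side by side, the two spellings of the same concept look roughly like this (illustrative table; the second form is the syntax proposed in this thread):

```sql
-- MySQL: VALUES() in the targetlist, with the SET keyword omitted
INSERT INTO tab (k, v) VALUES (1, 'new')
ON DUPLICATE KEY UPDATE v = VALUES(v);

-- This patch: CONFLICTING() plays the same role, and SET is retained
INSERT INTO tab (k, v) VALUES (1, 'new')
ON CONFLICT UPDATE SET v = CONFLICTING(v);
```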
Peter Geoghegan <pg@heroku.com> wrote:

> As I've mentioned, it isn't the MERGE syntax because that is
> quite a different thing. There is a place for it, but it's not
> strategically important in the same way as upsert is.

I think that the subset of the MERGE syntax that would be needed for UPSERT behavior would be as follows. For one row as literals:

MERGE INTO tab t
  USING (VALUES ('foo', 'p1')) new(id, colB)
  ON (t.id = new.id)
  WHEN MATCHED THEN
    UPDATE SET colB = new.colB
  WHEN NOT MATCHED THEN
    INSERT (id, colB) VALUES (new.id, new.colB);

If you have a bunch of rows in a "bar" table you want to merge in:

MERGE INTO tab t
  USING (SELECT id, colB FROM bar) b
  ON (t.id = b.id)
  WHEN MATCHED THEN
    UPDATE SET colB = b.colB
  WHEN NOT MATCHED THEN
    INSERT (id, colB) VALUES (b.id, b.colB);

I fail to see how this is harder or more problematic than the nonstandard suggestions that have been floated. I don't know why we would be even *considering* a nonstandard syntax rather than saying that only this subset is supported *so far*.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Sep 29, 2014 at 1:40 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> I think that the subset of the MERGE syntax that would be needed
> for UPSERT behavior would be as follows. For one row as literals:
>
> MERGE INTO tab t
>   USING (VALUES ('foo', 'p1')) new(id, colB)
>   ON (t.id = new.id)
>   WHEN MATCHED THEN
>     UPDATE SET colB = new.colB
>   WHEN NOT MATCHED THEN
>     INSERT (id, colB) VALUES (new.id, new.colB);
>
> If you have a bunch of rows in a "bar" table you want to merge in:
>
> MERGE INTO tab t
>   USING (SELECT id, colB FROM bar) b
>   ON (t.id = b.id)
>   WHEN MATCHED THEN
>     UPDATE SET colB = b.colB
>   WHEN NOT MATCHED THEN
>     INSERT (id, colB) VALUES (b.id, b.colB);
>
> I fail to see how this is harder or more problematic than the
> nonstandard suggestions that have been floated. I don't know why
> we would be even *considering* a nonstandard syntax rather than
> saying that only this subset is supported *so far*.

Heikki, Andres and I are against using MERGE for this, fwiw. Tom seemed to think so too, on previous occasions. It isn't a matter of alternative syntaxes. I have described in detail why I think it's a bad idea - I have linked to that about 3 times in this thread. It paints us into a corner when we go to make this do what MERGE is supposed to do.

Do you want a feature that, when fully generalized, plays a special visibility game based on whether or not some exact set of conditions are met? That is a non-starter, IMV. The whole idea of using an arbitrary join syntax seems great, but I need something that works backwards from would-be unique violations. That's the only way to preserve the UPSERT guarantees (atomicity, definite insert or update).

--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> wrote: > Heikki, Andres and I are against using MERGE for this, fwiw. Tom > seemed to think so too, on previous occasions. It isn't a matter > of alternative syntaxes. I have described in detail why I think > it's a bad idea - I have linked to that about 3 times in this > thread. Yeah, I read that, and I'm not convinced. > It paints us into a corner when we go to make this do what MERGE > is supposed to do. Do you want a feature that, when fully > generalized, plays a special visibility game based on whether or > not some exact set of conditions are met? That is a non-starter, > IMV. For other queries we use different access techniques not only based on the presence of an index, but on the state of the visibility map, degree of bloat, ordering of tuples in a heap, etc. -- so sure, I'm OK with different execution styles based on whether your join conditions match a unique index on columns that can't be NULL. > The whole idea of using an arbitrary join syntax seems great, > but I need something that works backwards from would-be unique > violations. That's the only way to preserve the UPSERT guarantees > (atomicity, definite insert or update). I absolutely don't buy that it is the *only way*. It is probably (by far) the *easiest* way, and doing so gets us a frequently- requested feature; but I think limiting the initial implementation to cases where the join conditions include equality tests on all columns of some appropriate unique index is fine, and doesn't seem to me to preclude further development of the MERGE feature for additional cases. In fact, I think having something to build on is a plus. The claims that you can't get a duplicate key error with an UPSERT are completely bogus, IMV. The *best* you can do is avoid them on the index used for matching (unless you're willing to ignore problem input rows or mangle the data in surprising ways to avoid such an error on a second unique index). 
With a fully functional MERGE syntax you could eventually gain the ability to write exceptions like that to the location of your choice (be it a table or WARNING messages in the log). -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
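A sketch of the second-unique-index scenario Kevin describes, using the proposed syntax and a hypothetical table with two unique constraints:

```sql
CREATE TABLE upsert (
    key INT  PRIMARY KEY,
    alt TEXT UNIQUE,
    val TEXT
);

INSERT INTO upsert VALUES (1, 'a', 'x');

-- The conflict path is decided by "key"; key = 2 does not conflict,
-- so an insert is attempted - but the new row collides with
-- alt = 'a', so a duplicate key error on the second unique index
-- would still be expected.
INSERT INTO upsert (key, alt, val) VALUES (2, 'a', 'y')
ON CONFLICT UPDATE SET val = 'y';
```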
On Mon, Sep 29, 2014 at 2:20 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > The claims that you can't get a duplicate key error with an UPSERT > are completely bogus, IMV. The *best* you can do is avoid them on > the index used for matching (unless you're willing to ignore > problem input rows or mangle the data in surprising ways to avoid > such an error on a second unique index). That's what I meant. Doing any more than that isn't useful. I want to do exactly that - no more, no less. If you're still not convinced, then I think the fact that no MERGE implementation does what you want should be convincing. It is *horrifically* complicated to make what you want work, if indeed it is technically feasible at all. Isn't this already complicated enough? We use different access techniques as you say. We do not use different types of snapshots. That seems like a pretty fundamental distinction. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> wrote:

> I think the fact that no MERGE implementation does what you want
> should be convincing. It is *horrifically* complicated to make
> what you want work, if indeed it is technically feasible at all.
> Isn't this already complicated enough?

What about the MERGE syntax I posted makes it hard to implement the statement validation and execution code you already have? (I'm asking about the UPSERT case only, not an implementation of all aspects of the standard syntax.)

To recap, in summary that would be:

MERGE INTO tablename [ alias ]
  USING ( relation ) [ alias ]
  ON ( boolean-expression )
  WHEN MATCHED THEN
    UPDATE SET target-column = expression
      [ , target-column = expression ] ...
  WHEN NOT MATCHED THEN
    INSERT ( target-columns ) VALUES ( expressions )

The initial implementation could restrict to these exact clauses and require that the boolean-expression used equality-quals on all columns of a unique index on only NOT NULL columns. I think the relation could be a VALUES clause or any SELECT statement without causing problems; do you think that would need to be constrained in some way? It would be wonderful if the expressions could be any arbitrary expressions assignable to the target columns; do you see a need to constrain that?

If we later expand the MERGE statement to more general cases, I don't see why statements of this form could not be treated as a special case. Personally, I'm dubious that we would want to compromise transactional integrity to achieve the broader case, but doubt that we would need to do so. I won't say it is just a SMOP, because there would need to be some careful design first. ;-)

> We use different access techniques as you say. We do not use
> different types of snapshots. That seems like a pretty
> fundamental distinction.

We use special types of snapshots in running DML that fires certain types of constraints, like FKs.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote:
> Peter Geoghegan <pg@heroku.com> wrote:
>
>> I think the fact that no MERGE implementation does what you want
>> should be convincing. It is *horrifically* complicated to make
>> what you want work, if indeed it is technically feasible at all.
>> Isn't this already complicated enough?
>
> What about the MERGE syntax I posted makes it hard to implement the
> statement validation and execution code you already have? (I'm
> asking about the UPSERT case only, not an implementation of all
> aspects of the standard syntax.)
>
> To recap, in summary that would be:
>
> MERGE INTO tablename [ alias ]
>   USING ( relation ) [ alias ]
>   ON ( boolean-expression )
>   WHEN MATCHED THEN
>     UPDATE SET target-column = expression
>       [ , target-column = expression ] ...
>   WHEN NOT MATCHED THEN
>     INSERT ( target-columns ) VALUES ( expressions )
>
> The initial implementation could restrict to these exact clauses
> and require that the boolean-expression used equality-quals on all
> columns of a unique index on only NOT NULL columns.

That'll make it really hard to actually implement real MERGE.

Because suddenly there's no way for the user to know whether he's written an ON condition that can implement UPSERT-like properties (i.e. the *precise* column list of an index) or not.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Sep 29, 2014 at 3:02 PM, Andres Freund <andres@2ndquadrant.com> wrote: > That'll make it really hard to actually implement real MERGE. > > Because suddenly there's no way for the user to know whether he's > written a ON condition that can implement UPSERT like properties > (i.e. the *precise* column list of an index) or not. Exactly. The difficulty isn't doing what Kevin says so much as doing so and then at a later date taking that thing and making it into a fully featured MERGE. We'll be painted into a corner. That's bad, because as I've said I think we need MERGE too (just far less urgently). -- Peter Geoghegan
Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote: >> The initial implementation could restrict to these exact clauses >> and require that the boolean-expression used equality-quals on all >> columns of a unique index on only NOT NULL columns. > > That'll make it really hard to actually implement real MERGE. > > Because suddenly there's no way for the user to know whether he's > written a ON condition that can implement UPSERT like properties > (i.e. the *precise* column list of an index) or not. Well, unless we abandon transactional semantics for other MERGE statements, we should have a way that UPSERT logic continues to work if you don't match a suitable index; it will just be slower -- potentially a lot slower, but that's what indexes are for. I don't think we need a separate statement type for the one we "do well", because I don't think we should do the other one without proper transactional semantics. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-09-29 15:08:36 -0700, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote: > > >> The initial implementation could restrict to these exact clauses > >> and require that the boolean-expression used equality-quals on all > >> columns of a unique index on only NOT NULL columns. > > > > That'll make it really hard to actually implement real MERGE. > > > > Because suddenly there's no way for the user to know whether he's > > written a ON condition that can implement UPSERT like properties > > (i.e. the *precise* column list of an index) or not. > > Well, unless we abandon transactional semantics for other MERGE > statements, we should have a way that UPSERT logic continues to > work if you don't match a suitable index; it will just be slower -- > potentially a lot slower, but that's what indexes are for. I don't > think we need a separate statement type for the one we "do well", > because I don't think we should do the other one without proper > transactional semantics. Wrong. You can't realistically implement the guarantees of UPSERT without a corresponding UNIQUE index. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > Wrong. You can't realistically implement the guarantees of UPSERT > without a corresponding UNIQUE index. You definitely can do it; the question is what you consider reasonable in terms of development effort, performance, and concurrency. I think the problem can be solved with non-scary values of pretty much any two of those. I guess my assumption is that we won't handle the general case until someone wants to put the substantial development effort into making the other two acceptable. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-09-29 15:16:49 -0700, Kevin Grittner wrote: > Andres Freund <andres@2ndquadrant.com> wrote: > > > Wrong. You can't realistically implement the guarantees of UPSERT > > without a corresponding UNIQUE index. > > You definitely can do it; the question is what you consider > reasonable in terms of development effort, performance, and > concurrency. Right. You can exclusively lock the table and such. The point is just that nobody wants that. I.e. people want to be warned about it. > I think the problem can be solved with non-scary values of pretty much > any two of those. I guess my assumption is that we won't handle the > general case until someone wants to put the substantial development > effort into making the other two acceptable. Which would be a major loss because MERGE is rather useful outside of atomic upsert. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Sep 29, 2014 at 3:08 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > Well, unless we abandon transactional semantics for other MERGE > statements, we should have a way that UPSERT logic continues to > work if you don't match a suitable index; it will just be slower -- > potentially a lot slower, but that's what indexes are for. I want an implementation that doesn't have unique violations, unprincipled deadlocks, or serialization failures at READ COMMITTED. I want it because that's what the majority of users actually want. It requires no theoretical justification. > I don't > think we need a separate statement type for the one we "do well", > because I don't think we should do the other one without proper > transactional semantics. That seems like a very impractical attitude. I cannot simulate what I've been doing with unique indexes without taking an exclusive table lock. That is a major footgun, so it isn't going to happen. -- Peter Geoghegan
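The table-lock emulation alluded to here would look something like the following sketch (illustrative names); the EXCLUSIVE lock blocks all concurrent writers until commit, which is exactly the footgun:

```sql
BEGIN;
LOCK TABLE tab IN EXCLUSIVE MODE;  -- serializes all concurrent DML
UPDATE tab SET v = 'new' WHERE k = 1;
INSERT INTO tab (k, v)
SELECT 1, 'new'
WHERE NOT EXISTS (SELECT 1 FROM tab WHERE k = 1);
COMMIT;
```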
On 09/30/2014 01:59 AM, Peter Geoghegan wrote: > On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> If you were an ORM developer reading the PostgreSQL Release Notes for >> 9.5, which URL would you visit to see a complete description of the >> new feature, including how it works concurrently, locking and other >> aspects. How would you check whether some strange behaviour was a bug, >> or intentional? > > We don't do that with UPDATE, so why would we do it with this? There > is an existing structure to the documentation that needs to be > respected. I tend to agree, so long as there are appropriate cross-references. See, for example, how window function information was added. >This is the case even though the EvalPlanQual() mechanism > is a total Postgres-ism, which can potentially violate snapshot > isolation (this is not true of Oracle's READ COMMITTED, for example). That's useful to know, and certainly worth covering in the isolation portion of the docs. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/30/2014 05:57 AM, Kevin Grittner wrote:
> Peter Geoghegan <pg@heroku.com> wrote:
>
>> I think the fact that no MERGE implementation does what you want
>> should be convincing. It is *horrifically* complicated to make
>> what you want work, if indeed it is technically feasible at all.
>> Isn't this already complicated enough?
>
> What about the MERGE syntax I posted makes it hard to implement the
> statement validation and execution code you already have? (I'm
> asking about the UPSERT case only, not an implementation of all
> aspects of the standard syntax.)

As I understand it, it isn't the syntax that's hard, it's the logic behind it. FWIW I'm pretty persuaded by the argument that:

* Other RDBMSes' MERGE implementations don't behave this way;
* MERGE is a join-based operation; it's not really the same as an upsert (though a join on a values-list is similar-ish);
* Making MERGE work for the concurrency-safe upsert case would render it harder to then support the rest of MERGE for the OLAP/data merging cases it's really specified for.

I also have a serious usability concern about re-purposing MERGE for this. I think it'll be confusing to have a MERGE that's usable as a concurrency-safe upsert and also as a non-concurrency-safe data merging operation with slightly different options. Borrowing from / closely following the MERGE syntax likely makes sense, but special-casing a subset of MERGE would IMO be a regrettable long-term decision.

> If we later expand the MERGE statement to more general cases, I
> don't see why statements of this form could not be treated as a
> special case.

Please, no. That's basically having two different kinds of statement with subtly different syntax differentiating them. Upsert is full of confusing and subtle behaviour.
Any implementation needs to focus on making it easy to get right, and I don't think having something where small syntax variations can cause you to silently trip out of the concurrency-safe mode of operation would meet that need. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/30/2014 06:08 AM, Kevin Grittner wrote: > Well, unless we abandon transactional semantics for other MERGE > statements, we should have a way that UPSERT logic continues to > work if you don't match a suitable index; it will just be slower -- > potentially a lot slower, but that's what indexes are for. That would probably lead to MERGE taking different lock strengths based on index availability, having different failure modes, etc. The less internal magic inside what's already a complicated and confusing area for users, the better. > I don't > think we need a separate statement type for the one we "do well", > because I don't think we should do the other one without proper > transactional semantics. "Proper transactional semantics" isn't the same as "free from all forms of race condition". Sometimes you want or need to do things that can't be made concurrency-safe, or would perform unacceptably if done in a concurrency-safe manner. That's why we have LOCK TABLE, among other things. We have READ COMMITTED for a reason. We have SELECT without FOR SHARE for a reason. MERGE seems to be specified as more of an OLAP / ETL operation than an OLTP one, and I think we should probably respect that - and the way other RDBMSes have already implemented it. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Sep 29, 2014 at 3:08 PM, Kevin Grittner <kgrittn@ymail.com> wrote: >> Well, unless we abandon transactional semantics for other MERGE >> statements, we should have a way that UPSERT logic continues to >> work if you don't match a suitable index; it will just be slower -- >> potentially a lot slower, but that's what indexes are for. > > I want an implementation that doesn't have unique violations, > unprincipled deadlocks, or serialization failures at READ COMMITTED. I > want it because that's what the majority of users actually want. It > requires no theoretical justification. Sure. I'm not suggesting otherwise. >> I don't think we need a separate statement type for the one we >> "do well", because I don't think we should do the other one >> without proper transactional semantics. > > That seems like a very impractical attitude. I cannot simulate what > I've been doing with unique indexes without taking an exclusive table > lock. That is a major footgun, so it isn't going to happen. There are certainly other ways to do it, although they require more work. As far as UPSERT goes, I agree that we should require such an index, at least for the initial implementation and into the foreseeable future. What I'm saying is that if we implement it using the standard MERGE syntax, then if the features of MERGE are extended it will continue to work even in the absence of such an index. The index becomes a way of optimizing access rather than defining what access is allowed. At the risk of pushing people away from this POV, I'll point out that this is somewhat similar to what we do for unlogged bulk loads -- if all the conditions for doing it the fast way are present, we do it the fast way; otherwise it still works, but slower. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 29 September 2014 18:59, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If you were an ORM developer reading the PostgreSQL Release Notes for 9.5, which URL would you visit to see a complete description of the new feature, including how it works concurrently, locking and other aspects? How would you check whether some strange behaviour was a bug, or intentional?
>
> We don't do that with UPDATE, so why would we do it with this?

Because this is new, harder and non-standard, so there is no other place to look. If you want to persuade us that MERGE has poorly defined concurrency, and that you have therefore implemented a new command, the new command had better have very well defined behaviour. And because a reviewer asked for it?

For example, this patch for UPSERT doesn't support updatable views. But I can't see how anyone that didn't read the patch would know that.

>>> All of these were added. There are two new sets of isolation tests, one per variant of the new clause (IGNORE/UPDATE).
>>
>> When you say "added", what do you mean? You posted one new doc patch, with no tests in it.
>
> I mean that there was a commit (not included with the documentation, but with the original patchset) with many tests. I don't know why you're suggesting that I don't have "concurrency tests". There are isolation tests in that commit. There are also many regression tests.

I see the tests in earlier patches; I was observing that there are no new ones. There are no tests for the use of CONFLICTING() syntax, and no tests for interaction with triggers, with regard to before triggers changing values prior to conflict detection. My hope was that the complex behaviour of multiple unique indexes might be explained there. Forgive me, I didn't see it.

>>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW. syntax used in triggers
>>>
>>> Why should it be the same?
>>
>> Because it would be a principled approach to do that.
>
> That is just an assertion. The MERGE syntax doesn't use that either.

MERGE allows "AS row", which then allows you to refer to row.x for column x of the input. Other people have independently commented the same thing.

>> If we aren't going to use MERGE syntax, it would make sense to at least use the same terminology.
>>
>> e.g.
>> INSERT ....
>> WHEN MATCHED
>> UPDATE
>>
>> The concept of "matched" is identical between MERGE and UPSERT and it will be confusing to have two words for the same thing.
>
> I don't care if we change the spelling to "WHEN MATCHED UPDATE/IGNORE". That seems fine. But MERGE is talking about a join, not the presence of a would-be duplicate violation.

I don't understand that comment.

>> There seems to be a good reason not to use the MySQL syntax of ON DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE operations other than a replace, so no deltas, e.g. SET a = a + x
>
> That isn't true, actually. It clearly does.

It does. Rather amusingly, I misread the very unclear MySQL docs.

>> Having said that, it would be much nicer to have a mode that allows you to just say the word "UPDATE" and have it copy the data into the correct columns, like MySQL does. That is very intuitive, even if it isn't very flexible.
>
> Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW.

If I want the incoming row to overwrite the old row, it would be good to have syntax to support that easily.

Why doesn't

INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT UPDATE SET t = 'fails';

end up with this in the table?

1 a
2 fails

What happens with this?

BEGIN;
INSERT INTO UNIQUE_TBL VALUES (2, 'b') ON CONFLICT UPDATE SET t = 'fails';
INSERT INTO UNIQUE_TBL VALUES (2, 'b') ON CONFLICT UPDATE SET t = 'fails';
COMMIT;

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
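For reference, the CONFLICTING() construct discussed above is the patch's way of letting the auxiliary UPDATE refer to the incoming row's values. A sketch of the "overwrite the old row" case Simon asks for, assuming CONFLICTING() takes a column name in this form (the exact spelling here is an assumption based on this thread, not taken from the patch itself):

```sql
-- Hypothetical sketch against the WIP patch's syntax: on conflict,
-- overwrite the existing row's val with the incoming row's val.
INSERT INTO upsert(key, val) VALUES (1, 'insert')
ON CONFLICT UPDATE SET val = CONFLICTING(val);
```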
On Tue, Sep 30, 2014 at 8:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW. >>>>> syntax used in triggers >>>> >>>> Why should it be the same? >>> >>> Because it would be a principled approach to do that. >> >> That is just an assertion. The MERGE syntax doesn't use that either. > > MERGE allows "AS row" which then allow you to refer to row.x for > column x of the input. It does, but that isn't what you suggested. You talked about the OLD.*/NEW.* syntax. >> I don't care if we change the spelling to "WHEN MATCHED >> UPDATE/IGNORE". That seems fine. But MERGE is talking about a join, >> not the presence of a would-be duplicate violation. > > I don't understand that comment. I just mean that if you want to replace ON CONFLICT UPDATE with WHEN MATCHED UPDATE - that little part of the grammar - that seems okay. >> Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW. > > If I want the incoming row to overwrite the old row, it would be good > to have syntax to support that easily. Well, maybe I'll get around to that when things settle down. That's clearly in the realm of "nice to have", though. > Why doesn't > INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON > CONFLICT UPDATE SET t = 'fails'; > end up with this in the table? > > 1 a > 2 fails A "cardinality violation" - just like MERGE. As with MERGE, the final value of every row needs to be deterministic (within the command). > What happens with this? > > BEGIN; > INSERT INTO UNIQUE_TBL VALUES (2, 'b') ON CONFLICT UPDATE SET t = 'fails'; > INSERT INTO UNIQUE_TBL VALUES (2, 'b') ON CONFLICT UPDATE SET t = 'fails'; > COMMIT; It works fine. No cardinality violation with two separate commands. See the new ExecLockUpdateTuple() function within nodeModifyTable.c for extensive discussion on how this is handled. -- Peter Geoghegan
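The two answers above can be modeled with a toy in-memory upsert (plain Python for illustration, not PostgreSQL code): a single command may not affect the same row twice, while separate commands may each touch the row once.

```python
# Toy model of the cardinality-violation rule described above.
# A "table" is a dict keyed on the unique column; one call to
# upsert_command() stands in for one multi-row INSERT ... ON CONFLICT UPDATE.

class CardinalityViolation(Exception):
    """Raised when one command would affect the same row twice."""

def upsert_command(table, rows, update_val):
    touched = set()  # keys inserted or updated by *this* command
    for key, val in rows:
        if key in touched:
            # This command already affected this row: error, as with MERGE.
            raise CardinalityViolation(key)
        if key in table:
            table[key] = update_val   # the ON CONFLICT UPDATE arm
        else:
            table[key] = val          # the plain INSERT arm
        touched.add(key)

# Simon's first example: the third row conflicts with a row the same
# command just inserted, so the whole command fails.
try:
    upsert_command({}, [(1, 'a'), (2, 'b'), (2, 'b')], 'fails')
except CardinalityViolation:
    pass

# Simon's second example: two separate commands, so no cardinality
# violation; the second command simply updates the existing row.
t = {}
upsert_command(t, [(2, 'b')], 'fails')
upsert_command(t, [(2, 'b')], 'fails')
```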
On Tue, Sep 30, 2014 at 8:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 29 September 2014 18:59, Peter Geoghegan <pg@heroku.com> wrote:
>> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> If you were an ORM developer reading the PostgreSQL Release Notes for 9.5, which URL would you visit to see a complete description of the new feature, including how it works concurrently, locking and other aspects. How would you check whether some strange behaviour was a bug, or intentional?
>>
>> We don't do that with UPDATE, so why would we do it with this?
>
> Because this is new, harder and non-standard, so there is no other place to look. If you want to persuade us that MERGE has poorly defined concurrency, so you have implemented a new command, the new command had better have very well defined behaviour.

I'm making a point about the structure of the docs here. The behavior *is* documented, just not in the INSERT documentation - a situation I've compared with how EvalPlanQual() isn't discussed in the UPDATE/DELETE/SELECT FOR UPDATE docs. And EvalPlanQual() has some pretty surprising corner-case behaviors.

That having been said, maybe I could have gone into more detail on the "consensus among unique indexes" thing in another part of the documentation, since that isn't separately covered (only the aspects of when the predicate is evaluated in READ COMMITTED mode and other things like that were covered).

> For example, this patch for UPSERT doesn't support updatable views. But I can't see anyone that didn't read the patch would know that.

By reading the CREATE VIEW docs. Maybe there could stand to be a compatibility note in the main INSERT command, but I didn't want to do that as long as things were up in the air. It might be the case that we figure out good behavior for updatable views.

-- Peter Geoghegan
On 09/30/2014 11:20 AM, Peter Geoghegan wrote: >> > For example, this patch for UPSERT doesn't support updatable views. >> > But I can't see anyone that didn't read the patch would know that. > By reading the CREATE VIEW docs. Maybe there could stand to be a > compatibility note in the main INSERT command, but I didn't want to do > that as long as things were up in the air. It might be the case that > we figure out good behavior for updatable views. All of these things sound like good ideas for documentation improvements, but hardly anything which should block the patch. It has documentation, more than we'd require for a lot of other patches, and it's not like the 9.5 release is next month. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 2014-09-30 11:49:21 -0700, Josh Berkus wrote: > On 09/30/2014 11:20 AM, Peter Geoghegan wrote: > >> > For example, this patch for UPSERT doesn't support updatable views. > >> > But I can't see anyone that didn't read the patch would know that. > > By reading the CREATE VIEW docs. Maybe there could stand to be a > > compatibility note in the main INSERT command, but I didn't want to do > > that as long as things were up in the air. It might be the case that > > we figure out good behavior for updatable views. > > All of these things sound like good ideas for documentation > improvements, but hardly anything which should block the patch. It has > documentation, more than we'd require for a lot of other patches, and > it's not like the 9.5 release is next month. What's blocking it is that (afaik) no committer agrees with the approach taken to solve the concurrency problems. And several (Heikki, Robert, me) have stated their dislike of the proposed approach. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 09/30/2014 11:51 AM, Andres Freund wrote: >> All of these things sound like good ideas for documentation >> > improvements, but hardly anything which should block the patch. It has >> > documentation, more than we'd require for a lot of other patches, and >> > it's not like the 9.5 release is next month. > What's blocking it is that (afaik) no committer agrees with the approach > taken to solve the concurrency problems. And several (Heikki, Robert, > me) have stated their dislike of the proposed approach. If that's what's blocking it then fine. But if we might change the concurrency approach, then what's the point in quibbling about docs? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 09/30/2014 07:15 AM, Kevin Grittner wrote: > There are certainly other ways to do it, although they require more > work. As far as UPSERT goes, I agree that we should require such > an index, at least for the initial implementation and into the > foreseeable future. What I'm saying is that if we implement it > using the standard MERGE syntax, then if the features of MERGE are > extended it will continue to work even in the absence of such an > index. The index becomes a way of optimizing access rather than > defining what access is allowed. > > At the risk of pushing people away from this POV, I'll point out > that this is somewhat similar to what we do for unlogged bulk loads > -- if all the conditions for doing it the fast way are present, we > do it the fast way; otherwise it still works, but slower. Except that switching between fast/slow bulk loads affects *only* the speed of loading, not the locking rules. Having a statement silently take a full table lock when we were expecting it to be concurrent (because, for example, the index got rebuilt and someone forgot the UNIQUE) violates POLA from my perspective. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Tue, Sep 30, 2014 at 11:51 AM, Andres Freund <andres@2ndquadrant.com> wrote: > What's blocking it is that (afaik) no committer agrees with the approach > taken to solve the concurrency problems. And several (Heikki, Robert, > me) have stated their dislike of the proposed approach. Well, it depends on what you mean by "approach to concurrency problems". It's not as if a consensus has emerged in favor of another approach, and if there is to be another approach, the details need to be worked out ASAP. Even still, I would appreciate it if people could review the patch on the assumption that those issues will be worked out. After all, there are plenty of other parts to this that have nothing to do with value locking - the entire "top half", which has significant subtleties (some involving concurrency) in its own right, reasonably well encapsulated from value locking. A couple of weeks ago, I felt good about the fact that it seemed "time was on my side" 9.5-wise, but maybe that isn't true. Working through the community process for this patch is going to be very difficult. I think everyone understands that there could be several ways of implementing value locking. I really do think it's a well encapsulated aspect of the patch, though, so even if you hate how I've implemented value locking, please try and give feedback on everything else. Simon wanted to start with the user-visible semantics, which makes sense, but I see no reason to limit it to that. -- Peter Geoghegan
On 2014-09-30 12:05:46 -0700, Peter Geoghegan wrote:
> On Tue, Sep 30, 2014 at 11:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > What's blocking it is that (afaik) no committer agrees with the approach taken to solve the concurrency problems. And several (Heikki, Robert, me) have stated their dislike of the proposed approach.
>
> Well, it depends on what you mean by "approach to concurrency problems". It's not as if a consensus has emerged in favor of another approach, and if there is to be another approach, the details need to be worked out ASAP.

Well. People have given you outlines of approaches. And Heikki even gave you a somewhat working prototype. I don't think you can fairly expect more.

> Even still, I would appreciate it if people could review the patch on the assumption that those issues will be worked out.

Right now I don't really see the point. You've so far shown no inclination to accept significant concerns about your approach. And without an agreement about how to solve the concurrency issues the feature is dead in the water. And thus time spent reviewing isn't well spent. I'm pretty sure I'm not the only one feeling that way at this point.

> A couple of weeks ago, I felt good about the fact that it seemed "time was on my side" 9.5-wise, but maybe that isn't true. Working through the community process for this patch is going to be very difficult.

The community process involves accepting that your opinion isn't the community's. Believe me, I learned that the hard way. It's one thing to argue about the implementation of a feature for a week or four. Or even insist that you're right in some implementation detail local to your new code. But you've not moved one iota in the critical parts that affect large parts of the system in half a year.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Josh Berkus <josh@agliodbs.com> wrote: > On 09/30/2014 07:15 AM, Kevin Grittner wrote: >> At the risk of pushing people away from this POV, I'll point out >> that this is somewhat similar to what we do for unlogged bulk loads >> -- if all the conditions for doing it the fast way are present, we >> do it the fast way; otherwise it still works, but slower. > > Except that switching between fast/slow bulk loads affects *only* the > speed of loading, not the locking rules. Having a statement silently > take a full table lock when we were expecting it to be concurrent > (because, for example, the index got rebuilt and someone forgot the > UNIQUE) violates POLA from my perspective. I would not think that an approach which took a full table lock to implement the more general case would be accepted. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 09/30/2014 02:39 PM, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: >> On 09/30/2014 07:15 AM, Kevin Grittner wrote: > >>> At the risk of pushing people away from this POV, I'll point out >>> that this is somewhat similar to what we do for unlogged bulk loads >>> -- if all the conditions for doing it the fast way are present, we >>> do it the fast way; otherwise it still works, but slower. >> >> Except that switching between fast/slow bulk loads affects *only* the >> speed of loading, not the locking rules. Having a statement silently >> take a full table lock when we were expecting it to be concurrent >> (because, for example, the index got rebuilt and someone forgot the >> UNIQUE) violates POLA from my perspective. > > I would not think that an approach which took a full table lock to > implement the more general case would be accepted. Why not? There are certainly cases ... like bulk loading ... where users would find it completely acceptable. Imagine that you're merging 3 files into a single unlogged table before processing them into finished data. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> wrote: > On 09/30/2014 02:39 PM, Kevin Grittner wrote: >> Josh Berkus <josh@agliodbs.com> wrote: >>> On 09/30/2014 07:15 AM, Kevin Grittner wrote: >>> >>>> At the risk of pushing people away from this POV, I'll point out >>>> that this is somewhat similar to what we do for unlogged bulk loads >>>> -- if all the conditions for doing it the fast way are present, we >>>> do it the fast way; otherwise it still works, but slower. >>> >>> Except that switching between fast/slow bulk loads affects *only* the >>> speed of loading, not the locking rules. Having a statement silently >>> take a full table lock when we were expecting it to be concurrent >>> (because, for example, the index got rebuilt and someone forgot the >>> UNIQUE) violates POLA from my perspective. >> >> I would not think that an approach which took a full table lock to >> implement the more general case would be accepted. > > Why not? There are certainly cases ... like bulk loading ... where > users would find it completely acceptable. Imagine that you're merging > 3 files into a single unlogged table before processing them into > finished data. So the expectation is that when we implement MERGE it will, by default, take out an EXCLUSIVE lock for the entire target table for the entire duration of the command? I would have expected a bit more finesse. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 09/30/2014 02:51 PM, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: >> On 09/30/2014 02:39 PM, Kevin Grittner wrote: >>> Josh Berkus <josh@agliodbs.com> wrote: >>>> On 09/30/2014 07:15 AM, Kevin Grittner wrote: >>>> >>>>> At the risk of pushing people away from this POV, I'll point out >>>>> that this is somewhat similar to what we do for unlogged bulk loads >>>>> -- if all the conditions for doing it the fast way are present, we >>>>> do it the fast way; otherwise it still works, but slower. >>>> >>>> Except that switching between fast/slow bulk loads affects *only* the >>>> speed of loading, not the locking rules. Having a statement silently >>>> take a full table lock when we were expecting it to be concurrent >>>> (because, for example, the index got rebuilt and someone forgot the >>>> UNIQUE) violates POLA from my perspective. >>> >>> I would not think that an approach which took a full table lock to >>> implement the more general case would be accepted. >> >> Why not? There are certainly cases ... like bulk loading ... where >> users would find it completely acceptable. Imagine that you're merging >> 3 files into a single unlogged table before processing them into >> finished data. > > So the expectation is that when we implement MERGE it will, by > default, take out an EXCLUSIVE lock for the entire target table for > the entire duration of the command? I would have expected a bit > more finesse. I don't know that that is the *expectation*. However, I personally would find it *acceptable* if it meant that we could get efficient merge semantics on other aspects of the syntax, since my primary use for MERGE is bulk loading. Regardless, I don't think there's any theoretical way to support UPSERT without a unique constraint. Therefore eventual support of this would require a full table lock. Therefore having it use the same command as UPSERT with a unique constraint is a bit of a booby trap for users. 
This is a lot like the "ADD COLUMN with a default rewrites the whole table" booby trap which hundreds of our users complain about every month. We don't want to add more such unexpected consequences for users. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 2014-09-30 14:51:57 -0700, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
> > On 09/30/2014 02:39 PM, Kevin Grittner wrote:
> >> Josh Berkus <josh@agliodbs.com> wrote:
> >>> On 09/30/2014 07:15 AM, Kevin Grittner wrote:
> >>>
> >>>> At the risk of pushing people away from this POV, I'll point out that this is somewhat similar to what we do for unlogged bulk loads -- if all the conditions for doing it the fast way are present, we do it the fast way; otherwise it still works, but slower.
> >>>
> >>> Except that switching between fast/slow bulk loads affects *only* the speed of loading, not the locking rules. Having a statement silently take a full table lock when we were expecting it to be concurrent (because, for example, the index got rebuilt and someone forgot the UNIQUE) violates POLA from my perspective.
> >>
> >> I would not think that an approach which took a full table lock to implement the more general case would be accepted.
> >
> > Why not? There are certainly cases ... like bulk loading ... where users would find it completely acceptable. Imagine that you're merging 3 files into a single unlogged table before processing them into finished data.
>
> So the expectation is that when we implement MERGE it will, by default, take out an EXCLUSIVE lock for the entire target table for the entire duration of the command? I would have expected a bit more finesse.

I think it'd be acceptable. Alternatively we'll just accept that you can get uniqueness violations under concurrency. In many cases that'll be fine.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 30, 2014 at 3:01 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> I think it'd be acceptable. Alternatively we'll just accept that you can get uniqueness violations under concurrency. In many cases that'll be fine.

I think living with unique violations is the right thing with MERGE, fwiw.

-- Peter Geoghegan
On 2014-09-30 14:57:43 -0700, Josh Berkus wrote: > Regardless, I don't think there's any theoretical way to support UPSERT > without a unique constraint. You can do stuff like blocking predicate locking. But without indexes to support it that gets awfully complicated and unfunny. I don't think we want to go there. So essentially I agree with that statement. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-09-30 14:57:43 -0700, Josh Berkus wrote:
>
>> Regardless, I don't think there's any theoretical way to support UPSERT without a unique constraint.
>
> You can do stuff like blocking predicate locking. But without indexes to support it that gets awfully complicated and unfunny. I don't think we want to go there. So essentially I agree with that statement.

Well, as you seem to be saying, it's not too bad even with a non-unique index if we wanted to do a little extra work; and there are a lot of ways to potentially deal with it even without that. Theoretically, the number of ways to do this is limited only by time available to brainstorm.

That said, at no time have I advocated that we try to implement UPSERT in this release with anything but a UNIQUE index. The issue I raised was whether a subset of the MERGE syntax should be used to specify UPSERT rather than inventing our own syntax -- which doesn't seem in any way incompatible with requiring a unique index to match the expression. Given subsequent discussion, perhaps we could decorate it with something to indicate which manner of concurrency handling is desired? Techniques discussed so far are:

- UPSERT style
- Hold an EXCLUSIVE lock on the table
- Allow "native" concurrency management

An alternative which seems to be on some people's minds is to use a different command name for the first option (but why not keep the rest of the standard syntax?) and to require an explicit LOCK TABLE statement at the start of the transaction if you want the second option.

My preference, after this discussion, would be to default to UPSERT style if the appropriate conditions are met, and to default to the third option otherwise. If you want an exclusive lock, ask for it with the LOCK TABLE statement.

-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
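Kevin's second option, spelled out explicitly, would look roughly like the following. LOCK TABLE ... IN EXCLUSIVE MODE is existing PostgreSQL syntax; the MERGE statement itself is hypothetical in this context, since PostgreSQL does not implement it:

```sql
BEGIN;
-- Opt into table-level concurrency handling explicitly, rather than
-- having the command take a heavyweight lock silently:
LOCK TABLE upsert IN EXCLUSIVE MODE;
MERGE INTO upsert ...;  -- hypothetical
COMMIT;
```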
On Tue, Sep 30, 2014 at 2:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Well. People have given you outlines of approaches. And Heikki even gave you a somewhat working prototype. I don't think you can fairly expect more.

I don't expect anything, really. I asked nicely - that's all. I don't know why there is so much discussion of what I expect or don't expect. Things don't work around here by everyone doing only strictly what they're obligated to do. Everyone is strictly obligated to do nothing, when you get right down to it.

>> Even still, I would appreciate it if people could review the patch on the assumption that those issues will be worked out.
>
> Right now I don't really see the point. You've so far shown no inclination to accept significant concerns about your approach. And without an agreement about how to solve the concurrency issues the feature is dead in the water. And thus time spent reviewing isn't well spent.
>
> I'm pretty sure I'm not the only one feeling that way at this point.

I think that's *incredibly* unfair. There appears to be broad acceptance of the problems around deadlocking as a result of my work with Heikki. That was a major step forward. Now we all agree on the parameters of the discussion around value locking, AFAICT. There is an actual way forward, and not total quagmire -- great. I had to dig my heels in to win that much, and it wasn't easy. I accept that it probably wasn't easy for other people either, and I am thankful for the effort of other people, particularly Heikki, but also you.

>> A couple of weeks ago, I felt good about the fact that it seemed "time was on my side" 9.5-wise, but maybe that isn't true. Working through the community process for this patch is going to be very difficult.
>
> The community process involves accepting that your opinion isn't the community's. Believe me, I learned that the hard way.

The community doesn't have a worked-out opinion on this either way.
Arguably, what you and Simon want to do is closer to what I want to do than to what Heikki wants to do - you're still talking about adding locks that are tied to AMs in a fairly fundamental way. But, FWIW, I'd sooner take Heikki's approach than insert promise tuples into indexes directly. I think that Heikki's approach is better. In all honesty, I don't care who "wins", as long as someone does and we get the feature in shape. No one can "win" if all sides are not realistic about the problems.

The issues that I've called out about what Heikki has suggested are quite significant issues. Can't we talk about them? Or am I required to polish up Heikki's approach, and present it at a commitfest, only to have somebody point out the same issues then? I am *not* nitpicking, and the issues are of fundamental importance. Look at the issues I raise and you'll see that's the case. My pointing out of these issues is not some artifice to "win" the argument. I don't appreciate the insinuation that it is. I am completely undeserving of that sort of mistrust. It's insulting. And it's also a total misrepresentation to suggest it's me versus you, Heikki, Robert, and Simon. Opinion is far more divided than you let on, since what you and Simon suggest is far different to what Heikki suggests. Let's figure out a way to reach agreement.

> It's one thing to argue about the implementation of a feature for a week or four. Or even insist that you're right in some implementation detail local to your new code. But you've not moved one iota in the critical parts that affect large parts of the system in half a year.

You're right. I haven't moved one bit on that. But, on the other hand, I haven't doubled down on the approach either - I have done very little on it, and have given it relatively little thought either way. I preferred to focus my energies on the "top half". Surely you'd agree that that was the logical course of action to take over the last few months.
I don't know if you noticed, but I presented this whole new revised version as "this is the thing that gives us the ability to discuss the fundamental issue of value locking". So my suggestion was that if you don't want to have that conversation, at least look at the "top half" a bit. -- Peter Geoghegan
On 30 September 2014 19:49, Josh Berkus <josh@agliodbs.com> wrote: > On 09/30/2014 11:20 AM, Peter Geoghegan wrote: >>> > For example, this patch for UPSERT doesn't support updatable views. >>> > But I can't see anyone that didn't read the patch would know that. >> By reading the CREATE VIEW docs. Maybe there could stand to be a >> compatibility note in the main INSERT command, but I didn't want to do >> that as long as things were up in the air. It might be the case that >> we figure out good behavior for updatable views. > > All of these things sound like good ideas for documentation > improvements, but hardly anything which should block the patch. It has > documentation, more than we'd require for a lot of other patches, and > it's not like the 9.5 release is next month. We won't get consensus simply by saying "Would you like a fast upsert feature?" because everyone says Yes to that. A clear description of the feature being added is necessary to agree its acceptance. When we implement a SQL Standard feature, we can just look in the standard to see how it should work and compare. When we go off-piste, we need more info to make sure we know what we are getting as well as why we are not getting something from the Standard. I have not suggested I would block the patch because it doesn't have docs. I have pointed out that the lack of consensus about the patch is because nobody knows what it contains, which others agreed with. My request was, and is, a proposed mechanism to *unblock* a very obviously stalled patch. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 30, 2014 at 4:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > A clear description of the feature being added is necessary to agree > its acceptance. When we implement a SQL Standard feature, we can just > look in the standard to see how it should work and compare. When we go > off-piste, we need more info to make sure we know what we are getting > as well as why we are not getting something from the Standard. I think that's fair. > I have not suggested I would block the patch because it doesn't have > docs. I have pointed out that the lack of consensus about the patch is > because nobody knows what it contains, which others agreed with. My > request was, and is, a proposed mechanism to *unblock* a very > obviously stalled patch. Please keep asking questions - it isn't necessarily obvious to me *what* isn't clear, because of my lack of perspective. That's a useful role. It occurs to me now that I ought to have found a place to document "cardinality violations" [1], but I didn't, for example. [1] http://tracker.firebirdsql.org/browse/CORE-2274 -- Peter Geoghegan
On 2014-09-26 16:19:33 -0700, Peter Geoghegan wrote: > On Fri, Sep 26, 2014 at 3:25 PM, Peter Geoghegan <pg@heroku.com> wrote: > > On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera > > <alvherre@2ndquadrant.com> wrote: > >> FWIW there are 28 callers of HeapTupleHeaderGetXmin. > > > Don't forget about direct callers to HeapTupleHeaderGetRawXmin(), > > though. There are plenty of those in tqual.c. > > Which reminds me: commit 37484ad2 added the opportunistic freezing > stuff. To quote the commit message: > > """ > Instead of changing the tuple xmin to FrozenTransactionId, the combination > of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never > set together, is now defined as HEAP_XMIN_FROZEN. A variety of previous > proposals to freeze tuples opportunistically before vacuum_freeze_min_age > is reached have foundered on the objection that replacing xmin by > FrozenTransactionId might hinder debugging efforts when things in this > area go awry; this patch is intended to solve that problem by keeping > the XID around (but largely ignoring the value to which it is set). > > """ > > Why wouldn't the same objection (the objection that the earlier > opportunistic freezing ideas stalled on) apply to directly setting > tuple xmin to InvalidTransactionId? Because it's pretty much unrelated? The FrozenTransactionId bit you reference is about tuples that actually survive, which isn't the case here. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 30, 2014 at 02:57:43PM -0700, Josh Berkus wrote: > I don't know that that is the *expectation*. However, I personally > would find it *acceptable* if it meant that we could get efficient merge > semantics on other aspects of the syntax, since my primary use for MERGE > is bulk loading. > > Regardless, I don't think there's any theoretical way to support UPSERT > without a unique constraint. Therefore eventual support of this would > require a full table lock. Therefore having it use the same command as > UPSERT with a unique constraint is a bit of a booby trap for users. > This is a lot like the "ADD COLUMN with a default rewrites the whole > table" booby trap which hundreds of our users complain about every > month. We don't want to add more such unexpected consequences for users. I think if we use the MERGE command for this feature we would need to use a non-standard keyword to specify that we want OLTP/UPSERT functionality. That would allow us to mostly use the MERGE standard syntax without having surprises about non-standard behavior. I am thinking of how CONCURRENTLY changes the behavior of some commands. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Oct 2, 2014 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote: > I think if we use the MERGE command for this feature we would need to > use a non-standard keyword to specify that we want OLTP/UPSERT > functionality. That would allow us to mostly use the MERGE standard > syntax without having surprises about non-standard behavior. I am > thinking of how CONCURRENTLY changes the behavior of some commands. That would leave you without a real general syntax. It'd also make having certain aspects of an UPSERT more explicit be a harder goal (there is no conventional join involved here - everything goes through a unique index). Adding the magic keyword would break certain other parts of the statement, so you'd have exact rules for what worked where. I see no advantage, and considerable disadvantages. Note that I've documented a lot of this stuff here: https://wiki.postgresql.org/wiki/UPSERT Mapping the join thing onto which unique index you want to make the UPSERT target is very messy. There are a lot of corner cases. It's quite ticklish. Please add to it if you think we've missed something. -- Peter Geoghegan
On Thu, Oct 2, 2014 at 02:08:30PM -0700, Peter Geoghegan wrote: > On Thu, Oct 2, 2014 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote: > > I think if we use the MERGE command for this feature we would need to > > use a non-standard keyword to specify that we want OLTP/UPSERT > > functionality. That would allow us to mostly use the MERGE standard > > syntax without having surprises about non-standard behavior. I am > > thinking of how CONCURRENTLY changes the behavior of some commands. > > That would leave you without a real general syntax. It'd also make > having certain aspects of an UPSERT more explicit be a harder goal > (there is no conventional join involved here - everything goes through > a unique index). Adding the magic keyword would break certain other > parts of the statement, so you'd have exact rules for what worked > where. I see no advantage, and considerable disadvantages. > > Note that I've documented a lot of this stuff here: > > https://wiki.postgresql.org/wiki/UPSERT > > Mapping the join thing onto which unique index you want to make the > UPSERT target is very messy. There are a lot of corner cases. It's > quite ticklish. > > Please add to it if you think we've missed something. OK, it was just an idea I wanted to point out, and if it doesn't work, it more clearly cements that we need UPSERT _and_ MERGE. Josh was pointing out that we don't want to surprise our users, so I suggested an additional keyword, which addresses his objections, but as you said, if that standard MERGE syntax doesn't give us what we want, then that is the fatal objection to using only MERGE. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Thu, Sep 4, 2014 at 12:13 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN >>> upsert_pkey UPDATE SET val = 'update'; >> >> It seems to me that it would be better to specify a conflicting column >> set rather than a conflicting index name. > > I'm open to pursuing that, provided there is a possible implementation > that's robust against things like BEFORE triggers that modify > constrained attributes. It must also work well with partial unique > indexes. So I imagine we'd have to determine a way of looking up the > unique index only after BEFORE triggers fire. Unless you're > comfortable with punting on some of these cases by throwing an error, > then all of this is actually surprisingly ticklish. Speaking of this, I really don't like the proposed behavior of firing BEFORE INSERT triggers even before we've decided whether to insert or update. In the "classical" upsert pattern, changes by a BEFORE INSERT trigger would get rolled back on conflict, but the new approach seems surprising: changes from BEFORE INSERT get persisted in the database, but AFTER INSERT is not fired. I haven't found any discussion about alternative trigger semantics for upsert. If there has been any, can you point me to it?

----

How about this: use the original VALUES results for acquiring a value lock; if indeed the row didn't conflict, *then* fire BEFORE INSERT triggers, and throw an error if the trigger changed any columns of the (specified?) unique key. Advantages of this approach:

1. Would solve the above conundrum about specifying a unique index via columns.
2. In the UPDATE case we can skip evaluating INSERT triggers and DEFAULT expressions for columns.
3. If I'm not missing anything, this approach may also let us get rid of the CONFLICTING() construct.
4. Possibly be closer to MySQL's syntax?
Point (2) is actually a major consideration IMO: if your query is mostly performing UPDATEs, on a table with SERIAL keys, and you're using a different key to perform the updates, then you waste sequence values unnecessarily. I believe this is a very common pattern, for example:

create table evt_type (id serial primary key, name text unique, evt_count int);
prepare upsert(text) as INSERT into evt_type (name, evt_count) values ($1, 1)
  on conflict within evt_type_name_key UPDATE set evt_count=evt_count+1;
execute upsert('foo');
execute upsert('foo');
execute upsert('bar');

# table evt_type;
 id | name | evt_count
----+------+-----------
  1 | foo  |         2
  3 | bar  |         1   <-- id could very well be "2"

Regards, Marti
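To make the trigger restriction proposed above concrete, here is a hedged sketch (the trigger and function names are invented; it assumes the "upsert" example table from upthread with a unique constraint on "key") of the kind of BEFORE INSERT trigger that would have to raise an error under this scheme, because it modifies a column of the unique key after the value lock was acquired on the original VALUES():

```sql
-- Hypothetical illustration only.
CREATE FUNCTION munge_key() RETURNS trigger AS $$
BEGIN
  -- Changing a constrained column here would invalidate a value lock
  -- taken on the original VALUES(), so the proposal would reject it.
  NEW.key := NEW.key + 100;
  RETURN NEW;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER upsert_before BEFORE INSERT ON upsert
  FOR EACH ROW EXECUTE PROCEDURE munge_key();

-- Under the proposed semantics, this statement would error out rather
-- than silently insert a row with key = 101:
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE SET val = 'update';
```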
On 2 October 2014 22:37, Bruce Momjian <bruce@momjian.us> wrote: > OK, it is was just an idea I wanted to point out, and if it doesn't > work, it more clearly cements that we need UPSERT _and_ MERGE. It seems clear that having two different initial keywords is popular because it provides clarity about which aspects of the commands will be supported. I like the idea of making the two commands as close as possible in syntax, which will make it easier to program for and encourage adoption. The command name could easily be MERGE [CONCURRENTLY] since that uses the same concept from earlier DDL syntax/keywords. In UPSERT, we don't need the ON keyword at all. If we are altering the syntax, then we can easily remove this. IIRC it wasn't agreed that we needed to identify which indexes in the upsert SQL statement itself, since this would be possible in other ways and would require programmers to know which unique constraints are declared. All of the other syntax could easily remain the same, leaving us with a command that looks like this...

MERGE CONCURRENTLY INTO foo USING VALUES ()
WHEN NOT MATCHED THEN
  INSERT
WHEN MATCHED THEN
  UPDATE

Since MERGE now supports DELETE and IGNORE as options, presumably we would also want to support those for the UPSERT version also. I think it would be useful to also support a mechanism for raising an error, as DB2 allows.
More complex example of MERGE

MERGE INTO product AS T
USING (SELECT sales.id, sum(sold) AS sold, max(catalog.name) as name
       FROM sales, catalog
       WHERE sales.id = catalog.id
       GROUP BY sales.id) AS S
ON S.id = T.id
WHEN MATCHED AND T.inventory = S.sold THEN
  DELETE
WHEN MATCHED AND T.inventory < S.sold THEN
  SIGNAL SQLSTATE '78000' SET MESSAGE_TEXT = 'Oversold: ' || S.name
WHEN MATCHED THEN
  UPDATE SET inventory = T.inventory - S.sold
WHEN NOT MATCHED THEN
  INSERT VALUES(S.id, S.name, -S.sold);

Full example would be similar to this

MERGE CONCURRENTLY INTO product AS T
USING VALUES ()
WHEN MATCHED AND T.inventory = S.sold THEN
  DELETE
WHEN MATCHED AND T.inventory < S.sold THEN
  SIGNAL SQLSTATE '78000' SET MESSAGE_TEXT = 'Oversold: ' || S.name
WHEN MATCHED THEN
  UPDATE SET inventory = T.inventory - S.sold
WHEN NOT MATCHED THEN
  INSERT VALUES(S.id, S.name, -S.sold);

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 7, 2014 at 5:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > IIRC it wasn't agreed that we needed to identify which indexes in the > upsert SQL statement itself, since this would be possible in other > ways and would require programmers to know which unique constraints > are declared. Kevin seemed quite concerned about that. That is something that seems hard to reconcile with supporting the MERGE syntax. Perhaps Kevin can comment on that, since he was in favor of both being able to specify user intent by accepting a unique index, while also being in favor of the MERGE syntax. > All of the other syntax could easily remain the same, leaving us with > a command that looks like this... > > MERGE CONCURRENTLY INTO foo USING VALUES () > WHEN NOT MATCHED THEN > INSERT > WHEN MATCHED THEN > UPDATE > > Since MERGE now supports DELETE and IGNORE as options, presumably we > would also want to support those for the UPSERT version also. > I think it would be useful to also support a mechanism for raising an > error, as DB2 allows. It seems like what you're talking about here is just changing the spelling of what I already have. I think that would be confusing to users when the time comes to actually implement a fully-generalized MERGE, even with the clearly distinct MERGE CONCURRENTLY variant outlined here (which, of course, lacks an outer join, unlike MERGE proper). However, unlike the idea of trying to square the circle of producing a general purpose MERGE command that also supports the UPSERT use-case, my objection to this much more limited proposal is made purely on aesthetic grounds. I think that it is not very user-friendly; I do not think that it's a total disaster, which is what trying to solve both problems at once (MERGE bulkloading and UPSERTing) would result in. So FWIW, if the community is really set on something that includes the keyword MERGE, which is really all you outline here, then I can live with that. -- Peter Geoghegan
On Wed, Oct 8, 2014 at 3:47 AM, Peter Geoghegan <pg@heroku.com> wrote: > It seems like what you're talking about here is just changing the > spelling of what I already have. I think there's a subtle difference in expectations too. The current BEFORE INSERT trigger behavior is somewhat defensible with an INSERT-driven syntax (though I don't like it even now [1]). But the MERGE syntax, to me, strongly implies that insertion doesn't begin before determining whether a conflict exists or not. [1] http://www.postgresql.org/message-id/CABRT9RD6zriK+t6mnqQOqaozZ5z1bUaKh+kNY=O9ZqBZFoAuBg@mail.gmail.com Regards, Marti
On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote: > I think there's a subtle difference in expectations too. The current > BEFORE INSERT trigger behavior is somewhat defensible with an > INSERT-driven syntax (though I don't like it even now [1]). There is no way around it. We need to fire before row triggers to know what to insert on the one hand, but on the other hand (in general) we have zero ability to nullify the effects (or side-effects) of before triggers, since they may execute arbitrary user-defined code. I think there is a good case to be made for severely restricting what before row triggers can do, but it's too late for that. > But the > MERGE syntax, to me, strongly implies that insertion doesn't begin > before determining whether a conflict exists or not. I think you're right. Another strike against the MERGE syntax, then, since as I said we cannot even know what to check prior to having before row insert triggers fire. -- Peter Geoghegan
On Tue, Oct 7, 2014 at 2:27 PM, Marti Raudsepp <marti@juffo.org> wrote: > but the new approach seems > surprising: changes from BEFORE INSERT get persisted in the database, > but AFTER INSERT is not fired. I am sorry, I realize now that I misunderstood the current proposed trigger behavior, I thought what Simon Riggs wrote here already happens: https://groups.google.com/forum/#!msg/django-developers/hdzkoLYVjBY/bnXyBVqx95EJ But the point still stands: firing INSERT triggers when the UPDATE path is taken is counterintuitive. If we prevent changes of upsert key columns in BEFORE triggers then we get the benefits, including more straightforward trigger behavior and avoid problems with serial columns. Regards, Marti
On Wed, Oct 8, 2014 at 12:28 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote: >> I think there's a subtle difference in expectations too. The current >> BEFORE INSERT trigger behavior is somewhat defensible with an >> INSERT-driven syntax (though I don't like it even now [1]). > > There is no way around it. We need to fire before row triggers to know > what to insert on the one hand, but on the other hand (in general) we > have zero ability to nullify the effects (or side-effects) of before > triggers, since they may execute arbitrary user-defined code. With my proposal this problem disappears: if we prevent BEFORE triggers from changing key attributes of NEW in the case of upsert, then we can acquire value locks before firing any triggers (before even constructing the whole tuple), and have a guarantee that the value locks are still valid by the time we proceed with the actual insert/update. Other than changing NEW, the side effects of triggers are not relevant. Now, there may very well be reasons why this is tricky to implement, but I haven't heard any. Can you see any concrete reasons why this won't work? I can take a shot at implementing this, if you're willing to consider it. > I think > there is a good case to be made for severely restricting what before > row triggers can do, but it's too late for that. There are no users of new "upsert" syntax out there yet, so it's not too late to rehash the semantics of that. This in no way affects users of old INSERT/UPDATE syntax. Regards, Marti
On 8 October 2014 01:47, Peter Geoghegan <pg@heroku.com> wrote: > It seems like what you're talking about here is just changing the > spelling of what I already have. I think that would be confusing to > users when the time comes to actually implement a fully-generalized > MERGE, even with the clearly distinct MERGE CONCURRENTLY variant > outlined here (which, of course, lacks an outer join, unlike MERGE > proper). I change my view on this, after some more thought. (Hope that helps) If we implement MERGE, I can see we may also wish to implement MERGE CONCURRENTLY one day. That would be different to UPSERT. So in the future I think we will need 3 commands 1. MERGE 2. MERGE CONCURRENTLY 3. UPSERT So we no longer need to have the command start with the MERGE keyword. > However, unlike the idea of trying to square the circle of producing a > general purpose MERGE command that also supports the UPSERT use-case, > my objection to this much more limited proposal is made purely on > aesthetic grounds. I think that it is not very user-friendly; I do not > think that it's a total disaster, which is what trying to solve both > problems at once (MERGE bulkloading and UPSERTing) would result in. So > FWIW, if the community is really set on something that includes the > keyword MERGE, which is really all you outline here, then I can live > with that. We will one day have MERGE according to the SQL Standard. My opinion is that syntax for this should be similar to MERGE in the *body* of the command, rather than some completely different syntax. e.g. > WHEN NOT MATCHED THEN > INSERT > WHEN MATCHED THEN > UPDATE I'm happy that we put that to a vote on what the syntax should be, as long as we bear in mind that we will one day have MERGE as well. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Having said that, it would be much nicer to have a mode that allows > you to just say the word "UPDATE" and have it copy the data into the > correct columns, like MySQL does. That is very intuitive, even if it > isn't very flexible. I thought about this, and at first I agreed, but now I'm not so sure. Consider the case where you write an INSERT ... ON CONFLICT UPDATE ALL query, or however we might spell this idea. 1. Developer writes the query, and it works fine. 2. Some time later, the DBA adds an inserted_at column (those are common). The DBA is not aware of the existence of this particular query. The new column has a default value of now(), say. Should we UPDATE the inserted_at column to be NULL? Or (more plausibly) the default value filled in by the INSERT? Or leave it be? I think there is a case to be made for all of these behaviors, and that tension makes me prefer to not do this at all. It's like encouraging "SELECT *" queries in production, only worse. -- Peter Geoghegan
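A hedged sketch of the scenario being described (the "UPDATE ALL" spelling and the inserted_at column are hypothetical, not part of the patch):

```sql
-- The DBA later adds an audit column with a default:
ALTER TABLE upsert ADD COLUMN inserted_at timestamptz DEFAULT now();

-- Hypothetical "update every column" spelling. On the UPDATE path,
-- should inserted_at become NULL, the freshly evaluated default
-- (clobbering the original insertion time), or be left untouched?
INSERT INTO upsert(key, val) VALUES(1, 'insert')
ON CONFLICT UPDATE ALL;  -- invented syntax, shown only to pose the question
```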
On Wed, Oct 8, 2014 at 12:12 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Having said that, it would be much nicer to have a mode that allows >> you to just say the word "UPDATE" and have it copy the data into the >> correct columns, like MySQL does. That is very intuitive, even if it >> isn't very flexible. > > I thought about this, and at first I agreed, but now I'm not so sure. Actually, I don't think MySQL supports this. It doesn't allow INSERT ON DUPLICATE KEY UPDATE to do it, AFAICT. Their REPLACE syntax supports that, but that's a feature that is quite distinct to what I have in mind here. -- Peter Geoghegan
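For reference, the two MySQL behaviors being contrasted here look like this (MySQL syntax; column names reuse the running example):

```sql
-- ON DUPLICATE KEY UPDATE requires an explicit assignment per column;
-- there is no "copy everything" mode:
INSERT INTO upsert (`key`, val) VALUES (1, 'insert')
ON DUPLICATE KEY UPDATE val = VALUES(val);

-- REPLACE takes the new row wholesale, but does so by removing the
-- conflicting row and inserting a fresh one, so any column not
-- supplied reverts to its default rather than keeping its old value:
REPLACE INTO upsert (`key`, val) VALUES (1, 'insert');
```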
On Wed, Oct 8, 2014 at 6:25 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > I change my view on this, after some more thought. (Hope that helps) Great. > If we implement MERGE, I can see we may also wish to implement MERGE > CONCURRENTLY one day. That would be different to UPSERT. > > So in the future I think we will need 3 commands > > 1. MERGE > 2. MERGE CONCURRENTLY > 3. UPSERT > > So we no longer need to have the command start with the MERGE keyword. As I've outlined, I don't see how MERGE CONCURRENTLY could ever work, but I'm glad that you agree that it should not block this effort (or indeed, some later effort to implement a MERGE that is comparable to the implementations of other database systems). > We will one day have MERGE according to the SQL Standard. Agreed. > My opinion is that syntax for this should be similar to MERGE in the > *body* of the command, rather than some completely different syntax. > e.g. > >> WHEN NOT MATCHED THEN >> INSERT >> WHEN MATCHED THEN >> UPDATE > > I'm happy that we put that to a vote on what the syntax should be, as > long as we bear in mind that we will one day have MERGE as well. While I am also happy with taking a vote, if we do so I vote against even this much less MERGE-like syntax. It's verbose, and makes much less sense when the mechanism is driven by would-be duplicate key violations rather than an outer join. I also like that when you UPSERT with the proposed ON CONFLICT UPDATE syntax, you get all the flexibility of an INSERT - you can use data-modifying CTEs, and nest the statement in a data-modifying CTE, and "INSERT ... SELECT... ON CONFLICT UPDATE ..." . And to be honest, it's much simpler to implement this whole feature as an adjunct to how INSERT statements are currently processed (during parse analysis, planning and execution); I don't want to make the syntax work against that. 
For example, consider how little I had to change the grammar to make all of this work:

$ git diff master --stat | grep gram
 src/backend/parser/gram.y | 72 ++-

The code footprint of this patch is relatively small, and I think we can all agree that that's a good thing. -- Peter Geoghegan
On 08/10/14 11:28, Peter Geoghegan wrote: > On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote: >> But the >> MERGE syntax, to me, strongly implies that insertion doesn't begin >> before determining whether a conflict exists or not. > > I think you're right. Another strike against the MERGE syntax, then, > since as I said we cannot even know what to check prior to having > before row insert triggers fire. > True, but to me it also seems to be strike against using INSERT for this as I don't really see how you can make triggers work in a sane way if the UPSERT is implemented as part of INSERT (at least I haven't seen any proposal that I would consider sane from the user point of view). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Oct 7, 2014 at 5:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> IIRC it wasn't agreed that we needed to identify which indexes in the >> upsert SQL statement itself, since this would be possible in other >> ways and would require programmers to know which unique constraints >> are declared. > > Kevin seemed quite concerned about that. That is something that seems > hard to reconcile with supporting the MERGE syntax. Perhaps Kevin can > comment on that, since he was in favor of both being able to specify > user intent by accepting a unique index, while also being in favor of > the MERGE syntax. Well, I mostly wanted to make sure we properly considered what the implications were of using the standard syntax without other keywords or decorations before deciding to go the non-standard route. In spite of an alarming tendency for people to assume that meant that I didn't understand the desired semantics, I feel enough people have understood the question and weighed in in favor of an explicit choice between semantics, rather than inferring concurrency handling based on the availability of the index necessary for the slicker behavior. I'm willing to concede that overall consensus is leaning toward the view that UPSERT semantics should be conditioned on explicit syntax; I'll drop that much going forward. Granting that, I will say that I lean toward either the MERGE syntax with CONCURRENTLY being the flag to use UPSERT semantics, or a separate UPSERT command which is as close to identical to the MERGE syntax (other than the opening verb) as possible. I see that as still needing the ON clause so that you can specify which values match which columns from the target table. I'm fine with starting with the syntax in the standard, which has no DELETE or IGNORE options (as of the latest version I've seen). 
So the syntax I'm suggesting is close to what Simon is suggesting, but a more compliant form would be:

MERGE CONCURRENTLY INTO foo USING (VALUES (valuelist) aliases) ON (conditions)
WHEN NOT MATCHED
  INSERT [ (columnlist) ] VALUES (valuelist)
WHEN MATCHED
  UPDATE SET colname = expression [, ...]

Rather than pseudo-randomly picking a unique index or using a constraint or index name, the ON condition would need to allow matching based on equality to all columns of a unique index which only referenced NOT NULL columns; we would pick an index which matched those conditions. In any event, the unique index would be required if CONCURRENTLY was specified. Using column matching to pick the index (like we do when specifying a FOREIGN KEY constraint) is more in keeping with other SQL statements, and seems generally safer to me. It would also make it fairly painless for people to switch concurrency techniques for what is, after all, exactly the same operation except for differences in handling of concurrent conflicting DML. As I said, I'm also OK with using UPSERT in place of MERGE CONCURRENTLY. I also feel that if we could allow:

USING (VALUES (valuelist) [, ...])

that would be great. In fact, I don't see why that can't be pretty much any relation, but it doesn't have to be for a first cut. A relation would allow a temporary table to be loaded with a batch of rows where the intent is to UPSERT every row in the batch, without needing to write a loop to do it. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
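A sketch of what that batch pattern might look like under the syntax suggested above (everything here is hypothetical: neither MERGE CONCURRENTLY nor a relation-valued USING exists in the patch, and the staging table is invented):

```sql
-- Stage a batch of rows, then upsert the whole batch in one statement,
-- with no client-side loop:
CREATE TEMP TABLE staging (key int, val text);
COPY staging FROM STDIN;

MERGE CONCURRENTLY INTO upsert AS t
USING staging AS s ON (t.key = s.key)
WHEN NOT MATCHED
  INSERT (key, val) VALUES (s.key, s.val)
WHEN MATCHED
  UPDATE SET val = s.val;
```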
On 8 October 2014 21:16, Peter Geoghegan <pg@heroku.com> wrote: >> My opinion is that syntax for this should be similar to MERGE in the >> *body* of the command, rather than some completely different syntax. >> e.g. >> >>> WHEN NOT MATCHED THEN >>> INSERT >>> WHEN MATCHED THEN >>> UPDATE >> >> I'm happy that we put that to a vote on what the syntax should be, as >> long as we bear in mind that we will one day have MERGE as well. > > While I am also happy with taking a vote, if we do so I vote against > even this much less MERGE-like syntax. It's verbose, and makes much > less sense when the mechanism is driven by would-be duplicate key > violations rather than an outer join. It wouldn't be driven by an outer join, not sure where that comes from. MERGE is verbose, I agree. I don't always like the SQL Standard, I just think we should follow it as much as possible. You can't change the fact that MERGE exists, so I don't see a reason to have two variants of syntax that do roughly the same thing. MERGE syntax would allow many things, such as this...

WHEN NOT MATCHED AND x > 7 THEN
  INSERT
WHEN NOT MATCHED THEN
  INSERT
WHEN MATCHED AND y = 5 THEN
  DO NOTHING
WHEN MATCHED THEN
  UPDATE

etc

> I also like that when you UPSERT > with the proposed ON CONFLICT UPDATE syntax, you get all the > flexibility of an INSERT - you can use data-modifying CTEs, and nest > the statement in a data-modifying CTE, and "INSERT ... SELECT... ON > CONFLICT UPDATE ..." . And to be honest, it's much simpler to > implement this whole feature as an adjunct to how INSERT statements > are currently processed (during parse analysis, planning and > execution); I don't want to make the syntax work against that.

I spoke to someone today that preferred a new command keyword, like UPSERT, because the semantics of triggers are weird. Having a before insert trigger fire when there is no insert is quite strange.
Properly documenting that on hackers would help; have the comments made on the Django list been replayed here in some form? I'm very scared by your comments about data modifying CTEs etc. You have no definition of how they will work, nor tests of that. That part isn't looking like a benefit as things currently stand. I'm still waiting for some more docs to describe your intentions so they can be reviewed. Also, it would be useful to hear that you're going to do something about the references to rows using conflicting(), since nobody has agreed with you there. Or hopefully even that you've listened and implemented something differently already. (We need that, whatever we do with other elements of syntax). Overall, I'm not seeing too many comments that indicate you are accepting review comments and acting upon them. If you want acceptance from others, you need to begin with some yourself. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Oct 8, 2014 at 2:04 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> While I am also happy with taking a vote, if we do so I vote against >> even this much less MERGE-like syntax. It's verbose, and makes much >> less sense when the mechanism is driven by would-be duplicate key >> violations rather than an outer join. > > It wouldn't be driven by an outer join, not sure where that comes from. Right, I understood that it wouldn't be - which is the point. So with an UPSERT that retains influence from MERGE, NOT MATCHED means "no conflict", MATCHED means "conflict". That just seems like an odd way to spell the concept given, as you say, that we're not talking about an outer join. > MERGE is verbose, I agree. I don't always like the SQL Standard, I > just think we should follow it as much as possible. You can't change > the fact that MERGE exists, so I don't see a reason to have two > variants of syntax that do roughly the same thing. > > MERGE syntax would allow many things, such as this... > WHEN NOT MATCHED AND x > 7 THEN > INSERT > WHEN NOT MATCHED THEN > INSERT > WHEN MATCHED AND y = 5 THEN > DO NOTHING > WHEN MATCHED THEN > UPDATE > > etc But then you can have before row insert triggers fire, which as you acknowledge is more surprising with this syntax. > I spoke to someone today that preferred a new command keyword, like > UPSERT, because the semantics of triggers are weird. Having a before > insert trigger fire when there is no insert is quite strange. Properly > documenting that on hackers would help; has the comments made on the > Django list been replayed here in some form? Yes. It's also mentioned in the commit message of CONFLICTING() (patch 0003-*). And the documentation (both the proposed INSERT documentation, and the trigger documentation). There is a large comment on it in the code. So I've said it many times. > I'm very scared by your comments about data modifying CTEs etc.. You > have no definition of how they will work, not tests of that. 
That part > isn't looking like a benefit as things currently stand. Actually, I have a few smoke tests for that. But I don't see any need for special handling. When you have a data-modifying CTE, it can contain an INSERT, and there are no special restrictions on that INSERT (other than that it may not itself have a CTE, but that's true more generally). You can have data-modifying CTEs containing INSERTs, and data-modifying CTEs containing UPDATEs....what I've done is have data-modifying CTEs contain INSERTs that also happen to have an ON CONFLICT UPDATE clause. This new clause of INSERTs is in no more need of special documentation regarding interactions with data-modifying CTEs than UPDATE .... WHERE CURRENT OF is. The only possible exception I can think of would be cardinality violations where a vanilla INSERT in one part of a command (one data-modifying CTE) gives problems to the "UPSERT part" of the same command (because we give a special cardinality violation message when we try to update the same tuple twice in the same command). But that's a pretty imaginative complaint, and I doubt it would really surprise someone. Why would you be surprised by the fact that a new clause for INSERT plays nicely with existing clauses? It's nothing special - there is no special handling. > I'm still waiting for some more docs to describe your intentions so > they can be reviewed. I think it would be useful to add several more isolation tests, highlighting some of the cases you talked about. I'll work on that. While the way forward for WITHIN isn't clear, I think a WITHIN PRIMARY KEY variant would certainly be useful. Maybe it would be okay to forget about naming a specific unique index, while supporting an (optional) WITHIN PRIMARY KEY/NOT WITHIN PRIMARY KEY. 
It doesn't totally solve the problems, but may be a good compromise that mostly satisfies people that want to be able to clearly indicate user intent (Kevin, in particular), and satisfies other people that don't want to name a unique index (Heikki, in particular). Certainly, the Django people would like that, since they said as much. > Also, it would be useful to hear that you're going to do something > about the references to rows using conflicting(), since nobody has > agreed with you there. Or hopefully even that you've listened and > implemented something differently already. (We need that, whatever we > do with other elements of syntax). Do you really expect me to do major work on some aspect of the syntax like that, given, as you say, that nobody explicitly agreed with me (and only you disagreed with me)? The only remark I heard on that was from you (you'd prefer to use NEW.* and OLD.*). But you spent much more time talking about getting something MERGE-like, which NEW.*/OLD.* clearly isn't. CONFLICTING() is very close (identical?) to MySQL's use of "ON DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that, but it's news to me that people take particular issue with it. > Overall, I'm not seeing too many comments that indicate you are > accepting review comments and acting upon them. If you want acceptance > from others, you need to begin with some yourself. What, specifically, have I failed to act on? We are discussing the syntax here. I have very valid practical reasons for wanting to make this feature a clause of INSERT. That is a view that Andres seemed to agree with [1], for example. [1] http://www.postgresql.org/message-id/20140929070235.GP1169@alap3.anarazel.de -- Peter Geoghegan
On 8 October 2014 23:24, Peter Geoghegan <pg@heroku.com> wrote: >> Also, it would be useful to hear that you're going to do something >> about the references to rows using conflicting(), since nobody has >> agreed with you there. Or hopefully even that you've listened and >> implemented something differently already. (We need that, whatever we >> do with other elements of syntax). > > Do you really expect me to do major work on some aspect of the syntax > like that, given, as you say, that nobody explicitly agreed with me > (and only you disagreed with me)? The only remark I heard on that was > from you (you'd prefer to use NEW.* and OLD.*). > But you spent much > more time talking about getting something MERGE-like, which > NEW.*/OLD.* clearly isn't. Yes, it is. Look at the AS clause. > CONFLICTING() is very close (identical?) to MySQL's use of "ON > DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that, > but it's news to me that people take particular issue with it. 3 people have asked you questions or commented about the use of CONFLICTING() while I've been watching. It's clearly a non-standard mechanism and not inline with other Postgres usage. Nobody actually says "I object to this" - do they need to use that phrase before you take note? I'm beginning to feel that giving you review comments is being seen as some kind of negative action. Needing to repeat myself makes it clear that you aren't taking note. Yes, I expect you to do these things * collect other people's input, even if you personally disagree * if there is disagreement amongst reviewers, seek to resolve that in a fair and reasonable manner * publish a summary of changes requested * do major work to address them So yes, I really expect that. It doesn't matter that it is "only Simon" or "only Kevin". **One** comment is enough for you to take note. If there is disagreement, publishing the summary of changes you plan to make in your next version will help highlight that.
-- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Oct 8, 2014 at 10:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Do you really expect me to do major work on some aspect of the syntax >> like that, given, as you say, that nobody explicitly agreed with me >> (and only you disagreed with me)? The only remark I heard on that was >> from you (you'd prefer to use NEW.* and OLD.*). >> But you spent much >> more time talking about getting something MERGE-like, which >> NEW.*/OLD.* clearly isn't. > > Yes, it is. Look at the AS clause. You can alias each of the two tables being joined. But I only have one table, and no join. When you referred to NEW.* and OLD.*, you clearly were making a comparison with trigger WHEN clauses, and not MERGE (which is a comparison I made myself, although for more technical reasons). It hardly matters, though. >> CONFLICTING() is very close (identical?) to MySQL's use of "ON >> DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that, >> but it's news to me that people take particular issue with it. > > 3 people have asked you questions or commented about the use of > CONFLICTING() while I've been watching. Lots of people have asked me lots of questions. Again, as I said, I wasn't aware that CONFLICTING() was a particular point of contention. Please be more specific. > It's clearly a non-standard > mechanism and not inline with other Postgres usage. What would be "inline with other Postgres usage"? I don't think you've been clear on what you think is a better alternative. I felt a function-like expression was appropriate because the user refers to different tuples of the target table. It isn't like a join. Plus it's similar to the MySQL thing, but doesn't misuse VALUES() as a function-like thing. > If there is disagreement, publishing the summary of changes you plan > to make in your next version will help highlight that. I think I've done a pretty good job of collecting and collating the opinions of others, fwiw. -- Peter Geoghegan
On 9 October 2014 07:27, Peter Geoghegan <pg@heroku.com> wrote: > Please be more specific. Do not use CONFLICTING() which looks like it is a function. Instead, use a row qualifier, such as NEW, OLD etc to reference values from the incoming data e.g. CONFLICTING.value rather than CONFLICTING(value) Do not use the word CONFLICTING since it isn't clear whether you are referring to the row in the table or the value in the incoming data. I suggest the use of two separately named row qualifiers to allow us to use either of those when desired. I don't have suggestions as to what you should call those qualifiers, though Postgres already uses NEW and OLD in similar circumstances in triggers. (This has nothing at all to do with the MERGE command in the SQL standard, so please don't mention that here.) You may also wish to support the AS keyword, as MERGE does to make the above even more clear. e.g. SET col = EXISTING.col + NEW.col Thank you. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Oct 9, 2014 at 12:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Do not use CONFLICTING() which looks like it is a function. So is ROW(). Or COALESCE(). > Instead, use a row qualifier, such as NEW, OLD etc to reference values > from the incoming data > e.g. CONFLICTING.value rather than CONFLICTING(value) > > Do not use the word CONFLICTING since it isn't clear whether you are > referring to the row in the table or the value in the incoming data. If you don't have a word that you think would more clearly indicate the intent of the expression, I'm happy to hear suggestions from others. > You may also wish to support the AS keyword, as MERGE does to make the > above even more clear. > > e.g. SET col = EXISTING.col + NEW.col That's less clear, IMV. EXISTING.col is col - the very same Var. So why qualify that it's the existing value in one place but not the other? In fact, you can't do that now with updates in general:

postgres=# update upsert u set u.val = 'foo';
ERROR:  42703: column "u" of relation "upsert" does not exist
LINE 1: update upsert u set u.val = 'foo';
                            ^
LOCATION:  transformUpdateStmt, analyze.c:2068

This does work, which is kind of what you outline:

postgres=# update upsert u set val = u.val;
UPDATE 3

But MERGE accepts the former in other systems (in general, and for MERGE), where Postgres won't (for UPDATEs in general). Parse analysis of UPDATE targetlists just rejects this outright. FWIW, is either of the two tuples referenced here "NEW", in any sense? Informally, I'd say the new value is the resulting row - the final row value after the UPDATE. We want to refer to the existing row, and the row proposed for insertion (with all before trigger effects carried forward). Having the column reference go through an alias like this might be tricky. -- Peter Geoghegan
On Thu, Oct 9, 2014 at 11:11 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Oct 9, 2014 at 12:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Do not use CONFLICTING() which looks like it is a function. > > So is ROW(). Or COALESCE(). ROW and COALESCE behave almost like functions: they operate on any expression or value you pass to them.

db=# select coalesce('bar');
 coalesce
----------
 bar

Not so with CONFLICTING(): it only accepts a column name -- not a value -- and has knowledge of the surrounding statement that ordinary function-like constructs don't.

db=# INSERT into evt_type (name) values ('foo') on conflict UPDATE set name=conflicting('bar');
ERROR:  syntax error at or near "'bar'"
LINE 1: ...lues ('foo') on conflict UPDATE set name=conflicting('bar');

> If you don't have a word that you think would more clearly indicate > the intent of the expression, I'm happy to hear suggestions from > others. I also like NEW due to similarity with triggers, but I see your concern about it not actually being "new". Regards, Marti
On 9 October 2014 09:11, Peter Geoghegan <pg@heroku.com> wrote: >> You may also wish to support the AS keyword, as MERGE does to make the >> above even more clear. >> >> e.g. SET col = EXISTING.col + NEW.col > > That's less clear, IMV. EXISTING.col is col - the very same Var. So > why qualify that it's the existing value in one place but not the > other? In fact, you can't do that now with updates in general: > > postgres=# update upsert u set u.val = 'foo'; > ERROR: 42703: column "u" of relation "upsert" does not exist > LINE 1: update upsert u set u.val = 'foo'; > ^ > LOCATION: transformUpdateStmt, analyze.c:2068 YES, which is exactly why I did not say this, I said something different. > This does work, which is kind of what you outline: > > postgres=# update upsert u set val = u.val; > UPDATE 3 YES, which is why I said it. > But MERGE accepts the former in other systems (in general, and for > MERGE), where Postgres won't (for UPDATEs in general). Parse analysis > of UPDATE targetlists just rejects this outright. > > FWIW, is any of the two tuples reference here "NEW", in any sense? > Informally, I'd say the new value is the resulting row - the final row > value after the UPDATE. We want to refer to the existing row, and the > row proposed for insertion (with all before trigger effects carried > forward). YES, which is why I specifically requested the ability to reference "the incoming data". Common sense interpretations make for quicker and easier discussions. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Oct 9, 2014 at 1:33 AM, Marti Raudsepp <marti@juffo.org> wrote: > ROW and COALESCE behave almost like functions: they operate on any > expression or value you pass to them. Okay; then CONFLICTING() is like many of the XML expressions. -- Peter Geoghegan
On Thu, Oct 9, 2014 at 1:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > YES, which is why I specifically requested the ability to reference > "the incoming data". My point is that people are not really inclined to use an alias in UPDATEs in general when referring to the target. The thing that seems special (and worthy of special qualification) is the reference to what you call the "incoming data", and what I've called "tuples proposed for insertion" (after being affected by any before row triggers). -- Peter Geoghegan
On Thu, Oct 9, 2014 at 1:56 AM, Peter Geoghegan <pg@heroku.com> wrote: > My point is that people are not really inclined to use an alias in > UPDATEs in general when referring to the target. The thing that seems > special (and worthy of special qualification) is the reference to what > you call the "incoming data", and what I've called "tuples proposed > for insertion" (after being affected by any before row triggers). For simple cases, you might not even bother with CONFLICTING() - you might find it easier to just repeat the constant in the INSERT and UPDATE parts of the query. -- Peter Geoghegan
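For reference, the row-qualifier spelling being argued for in this subthread is the one that implementations of the clause ultimately settled on: PostgreSQL shipped EXCLUDED.column in 9.5, and SQLite (3.24+) likewise exposes the tuple proposed for insertion as a pseudo-table named excluded, referenced with ordinary column qualification rather than function-like syntax. A runnable sketch (table name and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(k INTEGER PRIMARY KEY, v INTEGER)")
conn.execute("INSERT INTO t VALUES (1, 10)")

# Both tuples can appear in one SET expression: the unqualified column
# refers to the existing row, while excluded.v is the value that the
# failed insert proposed for that row.
conn.execute("INSERT INTO t VALUES (1, 32) "
             "ON CONFLICT(k) DO UPDATE SET v = v + excluded.v")

print(conn.execute("SELECT v FROM t WHERE k = 1").fetchone()[0])  # -> 42
```

Note that this matches the shape of Simon's "SET col = EXISTING.col + NEW.col" example, except that only the incoming tuple gets a qualifier, which is also the point Peter makes above about the existing row not needing one.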
V 1.3 is attached. Like Simon, I think that it's premature to commit to one particular value locking implementation. Personally, I think that approach #2 is what we'll end up using, but it makes no sense to not maintain both at once, since it requires relatively little effort to do so. At the very least it's a useful tool for reviewers, who would otherwise be denied the opportunity to test whether any given concurrency problem was attributable to the value locking implementation, or something else. So there are two variants of V 1.3 attached - one uses an unchanged value locking implementation #1, while the other uses an unchanged implementation #2.

Highlights
========

* No more "WITHIN unique_index_name". There is a new syntax that supersedes it. Unique index inference based on columns (or expressions) is *mandatory* for the ON CONFLICT UPDATE variant. It remains optional for the IGNORE variant, because I think someone could very reasonably not care which unique index was implicated. As discussed, this implies that partial unique indexes are no longer supported (but expression indexes work just fine). This may be revisited, but I suggest doing so in a later release.
Example of merging on the unique index on column "name":

postgres=# explain insert into capitals(name, population, altitude) values ('Riga', 1000, 5) on conflict (name) update set altitude = excluded(altitude);
                              QUERY PLAN
----------------------------------------------------------------------
 Insert on capitals  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on capitals  (cost=0.00..1.01 rows=1 width=52)
(3 rows)

When we cannot infer a unique index, it looks like this:

postgres=# explain insert into capitals(name, population, altitude) values ('Riga', 1000, 5) on conflict (population) update set altitude = excluded(altitude);
ERROR:  42P10: could not infer which unique index to use from expressions/columns provided for ON CONFLICT
LINE 1: ...e, population, altitude) values ('Riga', 1000, 5) on conflic...
                                                             ^
HINT:  Partial unique indexes are not supported.
LOCATION:  transformConflictClause, parse_clause.c:2407

* No more planner kludges. Index paths are never created for the auxiliary UPDATE plan to begin with. To be tidy, TID scan paths are never added either. Unless the UPDATE will never proceed due to a tautological predicate like "WHERE false", we're guaranteed to have a "sequential scan" in a fairly principled way (as before, we just use this with EvalPlanQual(); it's just an implementation detail). I am not entirely qualified to say so, but I think that this makes my modifications to the optimizer look quite reasonable. The kludge of enforcing various restrictions (e.g. on subqueries appearing in the UPDATE) in the optimizer is also removed. We prefer to enforce everything during parse analysis. This also gets us better error messages, which is nice.

* Sane (although limited) support for table inheritance and updatable views. However, in general user-defined rules are unsupported - I cannot see how that could be made sane.
The IGNORE variant works for updatable views, and for inheritance relations with children (provided that there is no inference required, which effectively makes the UPDATE variant unsupported). However, both IGNORE and UPDATE variants work for relations that happen to be in an inheritance hierarchy, provided they have no children. I think it's fine to only support the IGNORE variant for relations with inheritance children, because even when users have the "partitioning pattern" use-case, there is no principled way of telling the difference between a vanilla INSERT, and an INSERT with an ON CONFLICT UPDATE clause from within the custom redirection trigger function. Also, unique indexes already only work at the relation level with inheritance - that's a long-standing limitation. And so, in general INSERTs better have the right inheritance child as their target - this is no more and no less true when there is an ON CONFLICT UPDATE clause (note that I'm talking about the "object orientated" inheritance use-case here, and not the "partitioning pattern" use-case - with the latter, it's all up to the trigger function to do the right thing). There may currently be an "UPDATE ONLY", but there is no "INSERT ONLY" - why should I add one? * Both INSERT and UPDATE sets of statement level triggers fire for INSERT with ON CONFLICT UPDATE. The number of rows affected doesn't matter, nor does it matter how they were or were not affected. This certainly seems like the correct behavior. Per-row triggers work the same as before, since far more thought went into their behavior earlier. This incorporates feedback from Kevin. * CONFLICTING() is renamed to EXCLUDED(). I know that some people wanted me to go a different way with this. I think that there are very good practical reasons not to [1], as well as good reasons related to design, but I still accept that CONFLICTING() isn't very descriptive. This spelling seems a lot better. 
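The inference rule in the Highlights above (the conflict target columns must identify an appropriate unique index, or the statement is rejected outright rather than falling back to a scan) is enforced the same way in SQLite's later adoption of the clause, which makes it easy to demonstrate; the capitals table here mirrors the EXPLAIN example, and the exact error text is SQLite's, not the patch's:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE capitals("
             "name TEXT UNIQUE, population INTEGER, altitude INTEGER)")

# "name" carries a unique constraint, so it can serve as the arbiter:
conn.execute("INSERT INTO capitals VALUES ('Riga', 1000, 5) "
             "ON CONFLICT(name) DO UPDATE SET altitude = excluded.altitude")

# "population" has no unique index, so no arbiter can be inferred, and
# the statement fails at prepare time rather than at execution:
try:
    conn.execute("INSERT INTO capitals VALUES ('Riga', 1000, 5) "
                 "ON CONFLICT(population) "
                 "DO UPDATE SET altitude = excluded.altitude")
except sqlite3.OperationalError as e:
    print("rejected:", e)
```

The semantic dependence on a particular unique index, rather than on whatever index the planner happens to pick, is exactly what the parse-analysis-versus-planning argument later in the thread is about.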
Clean-up
=======

If you take a look at the EXPLAIN output above, you'll see that the sequential scan node does not appear. Basically, I'm back to suppressing the implementation detail of that never-executed "sequential scan", but the new approach is far better than earlier approaches. Certain things actually associated with the sequential scan are now attributed to the parent (auxiliary) ModifyTable UPDATE node. Example (incidentally, note that "key" is used to infer a unique index to take as an arbiter index):

postgres=# explain insert into upsert values (1000, 'Plucky') on conflict (key) update set val = excluded(val) where key = 5;
                             QUERY PLAN
---------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on upsert  (cost=0.00..25.38 rows=6 width=36)
         Filter: (key = 5)
(4 rows)

This seems reasonable to me. Including the "(never executed)" sequential scan would be very confusing to users.

Inference
=======

Unique index inference (i.e. the way we figure out *which* unique index to use) occurs during parse analysis. I think it would be inappropriate, and certainly inconvenient to do it during planning. I maintain that ON CONFLICT DML statements have a legitimate semantic dependence on particular unique indexes, which makes this appropriate. I don't know about others, but my only problem with naming unique indexes directly was that it is unmaintainable, and very ugly. In principle, I think this semantic dependence is quite reasonable, and reflects the reality of how we all expect this to work. There are comments on the implications for plan caching and how that relates to where and when we perform unique index inference.

Documentation
===========

The documentation has been updated, incorporating feedback. I also made the cardinality violation error a lot clearer than before, since Craig said that was unclear.

Tests
====

Many tests were added.
I've added a couple of new isolation tests. insert-conflict-update-3 should be of particular interest. That test illustrates the visibility issues with the WHERE clause that I've already highlighted as a possible concern [2]. Unique index inference tests will also give you a fair idea of how flexible it is.

Remaining open items
=================

Apart from the obvious issue of value locking (i.e. verifying the correctness of its implementation in general), the only open items are: * RLS needs to be considered. I have yet to give it any real thought. * I could probably do better at postgres_fdw support. That seems like something that could be followed up on later, because it's clearly just about a Simple Matter of Programming. In summary, I was able to remove a lot of TODO/FIXME items here - almost all of them. I'm pretty happy about that. I'll have to edit the UPSERT wiki page, to strike out many open items... [1] http://www.postgresql.org/message-id/CAM3SWZQhiXQi1osT14V7spjQrUpmcnRtbXJe846-EB1bC+9i1g@mail.gmail.com [2] https://wiki.postgresql.org/wiki/UPSERT#Visibility_issues_and_the_proposed_syntax_.28WHERE_clause.2Fpredicate_stuff.29 -- Peter Geoghegan
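The WHERE clause in the EXPLAIN example above attaches a predicate to the auxiliary UPDATE: the conflict is still detected (and, in the patch, the conflicting row is still locked, which a single-connection sketch cannot show), but the update only proceeds when the predicate holds. The predicate behavior itself can be sketched with SQLite's version of the clause, using illustrative table contents:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upsert(key INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO upsert VALUES (1, 'Giraffe'), (5, 'Zebra')")

stmt = ("INSERT INTO upsert VALUES (?, 'Plucky') "
        "ON CONFLICT(key) DO UPDATE SET val = excluded.val WHERE key = 5")

conn.execute(stmt, (1,))  # conflicts, but the predicate fails: no update
conn.execute(stmt, (5,))  # conflicts and the predicate holds: row updated

print(conn.execute("SELECT key, val FROM upsert ORDER BY key").fetchall())
# -> [(1, 'Giraffe'), (5, 'Plucky')]
```

The visibility question flagged in [2] is about *which* version of the conflicting row such a predicate is evaluated against under concurrency, which is what insert-conflict-update-3 exercises.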
On Thu, Oct 23, 2014 at 6:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > Documentation > =========== > > The documentation has been updated, incorporating feedback. I also > made the cardinality violation error a lot clearer than before, since > Craig said that was unclear. For the convenience of those that want to read the documentation quickly to figure out one particular aspect of user-visible behavior, the latest revision is uploaded here: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs Of particular interest: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/transaction-iso.html#XACT-READ-COMMITTED http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createview.html#SQL-CREATEVIEW-UPDATABLE-VIEWS http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/ddl-inherit.html -- Peter Geoghegan
On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > * CONFLICTING() is renamed to EXCLUDED(). I know that some people > wanted me to go a different way with this. I think that there are very > good practical reasons not to [1], as well as good reasons related to > design, but I still accept that CONFLICTING() isn't very descriptive. > This spelling seems a lot better. Specifically, "some people" included at least three committers and at least one other community member no less prominent than yourself, or in other words, every single person who bothered to comment. You can think whatever you like; the chances of it being committed that way are zero. > Unique index inference (i.e. the way we figure out *which* unique > index to use) occurs during parse analysis. I think it would be > inappropriate, and certainly inconvenient to do it during planning. You're wrong. The choice of which index to use is clearly wildly inappropriately placed in the parse analysis phase, and if you think that has any chance of ever being committed by anyone, then you are presuming the existence of a committer who won't mind ignoring the almost-immediate revert request that would draw from, at the very least, Tom. As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE syntax proposed upthread was the best of any mentioned thus far. The MERGE-based syntaxes proposed upthread were crazily verbose for no discernable benefit. As much as I'd like to have this feature, your refusal to change anything except when asked at least three times each by about five different people makes this effort barely worth pursuing. You can say all you like that you're receptive to feedback, but multiple people here are telling you that they feel otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/24/2014 10:04 AM, Robert Haas wrote: > As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE > syntax proposed upthread was the best of any mentioned thus far. The > MERGE-based syntaxes proposed upthread were crazily verbose for no > discernable benefit. For those of us who haven't followed every post in this thread, is there somewhere I can see the proposed syntax? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Fri, Oct 24, 2014 at 1:18 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 10/24/2014 10:04 AM, Robert Haas wrote: >> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE >> syntax proposed upthread was the best of any mentioned thus far. The >> MERGE-based syntaxes proposed upthread were crazily verbose for no >> discernable benefit. > > For those of us who haven't followed every post in this thread, is there > somewhere I can see the proposed syntax? There are a couple of different proposals but this should give you a feeling for where we're at: http://www.postgresql.org/message-id/CA+TgmoZN=2AJKi1n4Jz5BkmYi8r_CPUDW+DtoppmTeLVmsOoqw@mail.gmail.com The part I like is the idea of making UPSERT look like an INSERT statement with an additional clause that says how a unique violation should be handled. The part nobody except Peter likes is using functional notation like CONFLICTING() or EXCLUDED() to pull in values from the tuple causing the unique violation. And there are some other areas of disagreement about particular keywords and so on. But I personally like that general style much more than the alternatives derived from MERGE. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 24, 2014 at 10:04 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan <pg@heroku.com> wrote: >> * CONFLICTING() is renamed to EXCLUDED(). I know that some people >> wanted me to go a different way with this. I think that there are very >> good practical reasons not to [1], as well as good reasons related to >> design, but I still accept that CONFLICTING() isn't very descriptive. >> This spelling seems a lot better. > > Specifically, "some people" included at least three committers and at > least one other community member no less prominent than yourself, or > in other words, every single person who bothered to comment. You can > think whatever you like; the chances of it being committed that way > are zero. This is the situation with unique index inference all over again. You don't like a function-like expression. Okay. It would be a lot more helpful if instead of just criticizing, you *also* looked at my reasons for not wanting to go with something that would necessitate adding a dummy range table entry [1]. There are some very good practical reasons for avoiding that. We only do that (the OLD.* and NEW.* stuff) in the context of CREATE RULE and one or two other things. We don't play any games like that during parse analysis of ordinary optimizable statements, which is what this is. And those dummy RTEs are always placeholders, AFAICT. Apart from those technical reasons, the two situations just aren't comparable from a user-visible perspective, but that isn't the main problem. I don't particularly expect you to come around to my view here. But it would be nice to have the problems with dummy RTEs and so on acknowledged, so it's understood that that is in itself a strange new direction for parse analysis of ordinary optimizable statements. >> Unique index inference (i.e. the way we figure out *which* unique >> index to use) occurs during parse analysis. 
I think it would be >> inappropriate, and certainly inconvenient to do it during planning. > > You're wrong. The choice of which index to use is clearly wildly > inappropriately placed in the parse analysis phase, and if you think > that has any chance of ever being committed by anyone, then you are > presuming the existence of a committer who won't mind ignoring the > almost-immediate revert request that would draw from, at the very > least, Tom. Why? This has nothing to do with optimization. You either have an appropriate unique index available, in which case you can proceed (and the cost associated with value locking the index and so on is an inherent cost, kind of like inserting into an index in general). Or you don't, and you can't (you get an error, not a sequential scan). That's what I'd call a semantic dependence on the index. Yes, that's kind of unusual, but it's the reality. I can have the code do unique index inference during planning instead if you insist, but I think that that isn't useful. If you want me to do things that way, please explain why. Otherwise, even if I simply acquiesce to your wishes, I may end up missing the point. > As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE > syntax proposed upthread was the best of any mentioned thus far. The > MERGE-based syntaxes proposed upthread were crazily verbose for no > discernable benefit. I don't think anyone is too insistent on pursuing that. I think that the INSERT .. ON CONFLICT UPDATE syntax has more or less emerged as the favorite. > As much as I'd like to have this feature, your refusal to change > anything except when asked at least three times each by about five > different people makes this effort barely worth pursuing. You can say > all you like that you're receptive to feedback, but multiple people > here are telling you that they feel otherwise. I'm tired of your assertions about why I haven't done certain things. It's not fair. 
I have incorporated a lot of feedback into V1.3 (and previous versions), which isn't acknowledged. Also, I moved to the USA this week, and things have been hectic. As I said in relation to unique index inference, please consider that I haven't immediately complied with that feedback because there are genuine technical obstacles that at the very least make it more difficult than it appears. And, I'm not necessarily able to comply quickly because of time constraints. [1] http://www.postgresql.org/message-id/CAM3SWZQhiXQi1osT14V7spjQrUpmcnRtbXJe846-EB1bC+9i1g@mail.gmail.com -- Peter Geoghegan
On Fri, Oct 24, 2014 at 3:01 PM, Peter Geoghegan <pg@heroku.com> wrote: > This is the situation with unique index inference all over again. You > don't like a function-like expression. Okay. It would be a lot more > helpful if instead of just criticizing, you *also* looked at my > reasons for not wanting to go with something that would necessitate > adding a dummy range table entry [1]. The problem here isn't that I haven't read your emails. I have read them all, including that one. Moreover, this isn't the first time you've asserted that someone hasn't read one of your emails. So if we're going to say what we each think would be helpful, then I think it would be helpful if you stopped accusing the people who are taking time to provide feedback on your work of having failed to read your emails. It's possible that there may be instances where that problem exists, but it beggars credulity to suppose that the repeated unanimous consensus against some of your design decisions is entirely an artifact of failure to pay attention. The fact is, I don't feel obliged to respond to every one of your emails, just as you don't respond to every one of mine. If you want this patch to ever get committed, it's your job to push it forward - not mine, not Simon's, and not Heikki's. Sometimes that means you have to solve hard problems instead of just articulating what they are. > There are some very good > practical reasons for avoiding that. We only do that (the OLD.* and > NEW.* stuff) in the context of CREATE RULE and one or two other > things. We don't play any games like that during parse analysis of > ordinary optimizable statements, which is what this is. And those > dummy RTEs are always placeholders, AFAICT. Apart from those technical > reasons, the two situations just aren't comparable from a user-visible > perspective, but that isn't the main problem. > > I don't particularly expect you to come around to my view here. 
But it > would be nice to have the problems with dummy RTEs and so on > acknowledged, so it's understood that that is in itself a strange new > direction for parse analysis of ordinary optimizable statements. You're conflating the user-visible syntax with the parse tree representation in way that is utterly without foundation. I don't have a position at this point on which parse-analysis representation is preferable, but it's completely wrong-headed to say that you "can't" make NEW.x produce the same parse-analysis output that your CONFLICTING(x) syntax would have created. Sure, it might be harder, but it's not that much harder, and it's definitely not an unsolvable problem. I do acknowledge that there might be a better syntax out there than NEW.x and OLD.x. I have not seen one proposed that I like better. Feel free to propose something. But don't keep re-proposing something that LOOKSLIKE(this) because nobody - other than perhaps you - likes that. And don't use the difficulty of parse analysis as a justification for your proposed syntax, because, except in extreme cases, there are going to be very few if any regular contributors to this mailing list who will accept that as a justification for one syntax over another. Syntax needs to be justified by being beautiful, elegant, precedent-driven, orthogonal, and minimalist - not by whether you might need an extra 25-75 lines of parse analysis code to make it work. >>> Unique index inference (i.e. the way we figure out *which* unique >>> index to use) occurs during parse analysis. I think it would be >>> inappropriate, and certainly inconvenient to do it during planning. >> >> You're wrong. 
>> The choice of which index to use is clearly wildly
>> inappropriately placed in the parse analysis phase, and if you think
>> that has any chance of ever being committed by anyone, then you are
>> presuming the existence of a committer who won't mind ignoring the
>> almost-immediate revert request that it would draw from, at the very
>> least, Tom.
>
> Why? This has nothing to do with optimization.

That is false. If there is more than one index that could be used, the system should select the best one. That is an optimization decision per se. Also, if a plan is saved - in the plancache, say, or in a view - the query can be re-planned if the index it depends on is dropped, but there's no way to redo parse analysis.

>> As much as I'd like to have this feature, your refusal to change
>> anything except when asked at least three times each by about five
>> different people makes this effort barely worth pursuing. You can say
>> all you like that you're receptive to feedback, but multiple people
>> here are telling you that they feel otherwise.
>
> I'm tired of your assertions about why I haven't done certain things.
> It's not fair. I have incorporated a lot of feedback into V1.3 (and
> previous versions), which isn't acknowledged. Also, I moved to the USA
> this week, and things have been hectic. As I said in relation to
> unique index inference, please consider that I haven't immediately
> complied with that feedback because there are genuine technical
> obstacles that at the very least make it more difficult than it
> appears. And, I'm not necessarily able to comply quickly because of
> time constraints.

Well, I'm equally tired of being asked to review patches that respond to only a small percentage of the feedback already given. I, too, sometimes lack the time to incorporate the feedback of others into my patches. When that happens, I don't re-post them until I do have the time.
I've even been known to drop patches altogether rather than continue arguing about them, as you will of course recall. There's no shame in taking longer to get something done, but asking other people to spend time on it when you haven't had time yourself can lead to frustrations. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 24, 2014 at 1:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> The problem here isn't that I haven't read your emails. I have read
> them all, including that one. Moreover, this isn't the first time
> you've asserted that someone hasn't read one of your emails. So if
> we're going to say what we each think would be helpful, then I think
> it would be helpful if you stopped accusing the people who are taking
> time to provide feedback on your work of having failed to read your
> emails. It's possible that there may be instances where that problem
> exists, but it beggars credulity to suppose that the repeated
> unanimous consensus against some of your design decisions is entirely
> an artifact of failure to pay attention.

Okay.

> The fact is, I don't feel obliged to respond to every one of your
> emails, just as you don't respond to every one of mine. If you want
> this patch to ever get committed, it's your job to push it forward -
> not mine, not Simon's, and not Heikki's. Sometimes that means you
> have to solve hard problems instead of just articulating what they
> are.

Okay.

> You're conflating the user-visible syntax with the parse tree
> representation in a way that is utterly without foundation. I don't
> have a position at this point on which parse-analysis representation
> is preferable, but it's completely wrong-headed to say that you
> "can't" make NEW.x produce the same parse-analysis output that your
> CONFLICTING(x) syntax would have created. Sure, it might be harder,
> but it's not that much harder, and it's definitely not an unsolvable
> problem.

I don't believe I did. The broader point is that the difficulty in making that work reflects the conceptual messiness, from user-visible aspects down. I can work with the difficulty, and I may even be able to live with the messiness, but I'm trying to bring the problems with it to a head sooner rather than later for good practical reasons.
In all sincerity, my real concern is that you or the others will change your mind when I actually go and implement an OLD.* style syntax, and see the gory details. I regret it if asking this is asking too much of you, but FYI that's the thought process behind it.

> I do acknowledge that there might be a better syntax out there than
> NEW.x and OLD.x. I have not seen one proposed that I like better.
> Feel free to propose something. But don't keep re-proposing something
> that LOOKSLIKE(this) because nobody - other than perhaps you - likes
> that.

Okay.

So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe that's an interesting precedent. During rewriting, this gets rewritten such that you end up with something that looks to the planner as if the original query included a constant (this actually comes from a catalog look-up for the column during rewriting). What if we spelled EXCLUDING/CONFLICTING as follows:

INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key)
UPDATE SET val = EXCLUDED || 'this works' WHERE another_col != EXCLUDED;

Then rewriting would figure these details out. From a design perspective, there'd need to be a few details worked out about how inference actually works - inferring *which* column the EXCLUDED expression actually referred to - but it seems doable, especially given the existing restrictions on the structure of the UPDATE. We're not rewriting from a SetToDefault to a constant, but a SetToDefault-like thing to a special Var (actually, the finished representation probably makes it to the execution stage with that Var representation filled in, unlike SetToDefault, but it's basically the same pattern). It solves my problem with dummy range table entries. Actually, *any* new kind of expression accomplishes this just as well. My concern here is more around not needing cute tricks with dummy RTEs than it is around being in favor of any particular expression-based syntax.

What do you think of that?
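[Editor's note: the DEFAULT precedent referred to above can be seen in stock SQL; a minimal sketch, with an illustrative table, and using the patch's in-discussion (non-final) ON CONFLICT spelling in the analogy:]

```sql
-- Existing precedent: DEFAULT in an UPDATE targetlist. During query
-- rewriting, the SetToDefault node is replaced by the column's default
-- expression, looked up in the catalogs, so the planner never sees
-- anything special.
CREATE TABLE upsert (key int PRIMARY KEY, val text DEFAULT 'n/a');
UPDATE upsert SET val = DEFAULT WHERE key = 1;

-- The proposal is analogous: a SetToDefault-like node, spelled EXCLUDED,
-- would be rewritten into a special Var referencing the corresponding
-- column of the tuple proposed for insertion (proposed syntax only):
INSERT INTO upsert VALUES (1, 'Art')
ON CONFLICT (key) UPDATE SET val = EXCLUDED || 'this works';
```
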
> And don't use the difficulty of parse analysis as a > justification for your proposed syntax, because, except in extreme > cases, there are going to be very few if any regular contributors to > this mailing list who will accept that as a justification for one > syntax over another. Syntax needs to be justified by being beautiful, > elegant, precedent-driven, orthogonal, and minimalist - not by whether > you might need an extra 25-75 lines of parse analysis code to make it > work. Well, that isn't the case here. I'd much rather code my way out of a disagreement with you. >>>> Unique index inference (i.e. the way we figure out *which* unique >>>> index to use) occurs during parse analysis. I think it would be >>>> inappropriate, and certainly inconvenient to do it during planning. >>> >>> You're wrong. The choice of which index to use is clearly wildly >>> inappropriately placed in the parse analysis phase, and if you think >>> that has any chance of ever being committed by anyone, then you are >>> presuming the existence of a committer who won't mind ignoring the >>> almost-immediate revert request that would draw from, at the very >>> least, Tom. >> >> Why? This has nothing to do with optimization. > > That is false. If there is more than one index that could be used, > the system should select the best one. That is an optimization > decision per se. Also, if a plan is saved - in the plancache, say, or > in a view - the query can be re-planned if the index it depends on is > dropped, but there's no way to do parse analysis. Generating index paths for the UPDATE is a waste of cycles. Theoretically, there could be an (a, b, c) unique index and a (c,b,a) unique index, and those two might have a non-equal cost to scan. 
But that almost certainly isn't going to happen in practice, since that's a rather questionable indexing strategy, and even when it does, you're going to have to insert into all the unique indexes a good proportion of the time anyway, making the benefits of that approach pale in comparison to the costs. And that's just the cost in CPU cycles, and not code complexity. I don't know why you want to bring a cost model, or a choice of indexes, into this. It simply isn't comparable to how the system comes up with which index to use in all other contexts.

Now, I could easily make all this happen during planning, just to not have to argue with you, but I think that doing so is less similar to how things already work, not more similar. It certainly doesn't imply more code reuse, since get_relation_info() is clearly quite unsuitable.

> Well, I'm equally tired of being asked to review patches that respond
> to only a small percentage of the feedback already given. I, too,
> sometimes lack the time to incorporate the feedback of others into my
> patches. When that happens, I don't re-post them until I do have the
> time. I've even been known to drop patches altogether rather than
> continue arguing about them, as you will of course recall. There's no
> shame in taking longer to get something done, but asking other people
> to spend time on it when you haven't had time yourself can lead to
> frustrations.

From my point of view, I spent a significant amount of time making the patch more or less match your proposed design for unique index inference. It is discouraging to hear that you think I'm not cooperating with community process. I'm doing my best. I think it would be a bad idea for me to not engage with the community for an extended period at this point. There were plenty of other issues addressed by V1.3 that were not the CONFLICTING()/EXCLUDING thing that you highlighted (or the other thing you highlighted around where to do unique index inference).

-- Peter Geoghegan
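[Editor's note: the hypothetical discussed above - two unique indexes over the same columns in different orders, either of which could arbitrate a conflict - can be sketched with ordinary DDL; table and index names are illustrative:]

```sql
-- Two unique indexes enforcing equivalent constraints over the same
-- columns. Either index could serve as the arbiter for ON CONFLICT,
-- and scanning one may in principle be cheaper than the other.
-- A questionable schema, as noted, but a legal one.
CREATE TABLE t (a int, b int, c int);
CREATE UNIQUE INDEX t_abc ON t (a, b, c);
CREATE UNIQUE INDEX t_cba ON t (c, b, a);
```
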
On 25/10/14 00:48, Peter Geoghegan wrote: >> You're conflating the user-visible syntax with the parse tree >> representation in way that is utterly without foundation. I don't >> have a position at this point on which parse-analysis representation >> is preferable, but it's completely wrong-headed to say that you >> "can't" make NEW.x produce the same parse-analysis output that your >> CONFLICTING(x) syntax would have created. Sure, it might be harder, >> but it's not that much harder, and it's definitely not an unsolvable >> problem. > > I don't believe I did. The broader point is that the difficulty in > making that work reflects the conceptually messiness, from > user-visible aspects down. I can work with the difficulty, and I may > even be able to live with the messiness, but I'm trying to bring the > problems with it to a head sooner rather than later for good practical > reasons. In all sincerity, my real concern is that you or the others > will change your mind when I actually go and implement an OLD.* style > syntax, and see the gory details. I regret it if to ask this is to ask > too much of you, but FYI that's the thought process behind it. > If you feel so strongly that it's wrong even though everybody else seems to prefer it and if you at the same time feel so strongly about people changing minds once you implement this, maybe the best way to convince us is to show us the implementation (at this point it would probably have taken less of your time than the argument did). > > So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe > that's an interesting precedent. During rewriting, this gets rewritten > such that you end up with something that looks to the planner as if > the original query included a constant (this actually comes from a > catalog look-up for the column during rewriting). 
What if we spelled > EXCLUDING/CONFLICTING as follows: > > INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val = > EXCLUDED || 'this works' WHERE another_col != EXCLUDED; > > Then rewriting would figure these details out. From a design > perspective, there'd need to be a few details worked out about how > inference actually works - inferring *which* column the EXCLUDED > expression actually referred to, but it seems doable, especially given > the existing restrictions on the structure of the UPDATE. We're not > rewriting from a SetToDefault to a constant, but a SetToDefault-like > thing to a special Var (actually, the finished representation probably > makes it to the execution stage with that Var representation filled > in, unlike SetToDefault, but it's basically the same pattern). It > solves my problem with dummy range table entries. Actually, *any* new > kind of expression accomplishes this just as well. My concern here is > more around not needing cute tricks with dummy RTEs than it is around > being in favor of any particular expression-based syntax. > > What do you think of that? > Ugh, you want to auto-magically detect what value is behind the EXCLUDED based on how/where it's used in the UPDATE? That seems like quite a bad idea. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek <petr@2ndquadrant.com> wrote: > If you feel so strongly that it's wrong even though everybody else seems to > prefer it and if you at the same time feel so strongly about people changing > minds once you implement this, maybe the best way to convince us is to show > us the implementation (at this point it would probably have taken less of > your time than the argument did). No, it wouldn't have - I don't think anyone believes that. Magic addRangeTableEntryForRelation() calls are only used in the context of one or two utility statements that have pretty limited scope. Support for an OLD.* style syntax would have to exist at *all* stages of query execution, from parse analysis through to rewriting, planning, and execution. That's the difference here - this isn't a utility command. >> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe >> that's an interesting precedent. During rewriting, this gets rewritten >> such that you end up with something that looks to the planner as if >> the original query included a constant (this actually comes from a >> catalog look-up for the column during rewriting). What if we spelled >> EXCLUDING/CONFLICTING as follows: >> >> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val = >> EXCLUDED || 'this works' WHERE another_col != EXCLUDED; >> >> Then rewriting would figure these details out. From a design >> perspective, there'd need to be a few details worked out about how >> inference actually works - inferring *which* column the EXCLUDED >> expression actually referred to, but it seems doable, especially given >> the existing restrictions on the structure of the UPDATE. 
We're not >> rewriting from a SetToDefault to a constant, but a SetToDefault-like >> thing to a special Var (actually, the finished representation probably >> makes it to the execution stage with that Var representation filled >> in, unlike SetToDefault, but it's basically the same pattern). It >> solves my problem with dummy range table entries. Actually, *any* new >> kind of expression accomplishes this just as well. My concern here is >> more around not needing cute tricks with dummy RTEs than it is around >> being in favor of any particular expression-based syntax. >> >> What do you think of that? >> > > Ugh, you want to auto-magically detect what value is behind the EXCLUDED > based on how/where it's used in the UPDATE? That seems like quite a bad > idea. That's *exactly* how DEFAULT works within UPDATE targetlists. There might be a few more details to work out here, but not terribly many, and that's going to be true no matter what. 95%+ of the time, it'll just be "val = EXCLUDED" anyway. -- Peter Geoghegan
Peter Geoghegan wrote: > On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek <petr@2ndquadrant.com> wrote: > > Ugh, you want to auto-magically detect what value is behind the EXCLUDED > > based on how/where it's used in the UPDATE? That seems like quite a bad > > idea. > > That's *exactly* how DEFAULT works within UPDATE targetlists. There > might be a few more details to work out here, but not terribly many, > and that's going to be true no matter what. 95%+ of the time, it'll > just be "val = EXCLUDED" anyway. Petr's thought mirrors mine. What are you going to do the other 5% of the time? Is there some other way to refer to the columns of the "excluded" row? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
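[Editor's note: the objection raised here can be made concrete with a sketch, using the proposed (non-final) bare-EXCLUDED spelling and illustrative column names:]

```sql
-- Inference is obvious in the common case:
--   ... UPDATE SET val = EXCLUDED
-- (clearly the excluded tuple's "val"), but much less so when the
-- right-hand side mixes EXCLUDED with other columns, or uses it twice:
--   ... UPDATE SET val = EXCLUDED || other_col,
--                  other_col = CASE WHEN EXCLUDED > 0
--                                   THEN EXCLUDED ELSE 0 END
-- Which excluded column does each occurrence refer to? A qualified
-- spelling such as NEW.val / OLD.val carries no such ambiguity.
```
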
On Fri, Oct 24, 2014 at 6:48 PM, Peter Geoghegan <pg@heroku.com> wrote: >> You're conflating the user-visible syntax with the parse tree >> representation in way that is utterly without foundation. I don't >> have a position at this point on which parse-analysis representation >> is preferable, but it's completely wrong-headed to say that you >> "can't" make NEW.x produce the same parse-analysis output that your >> CONFLICTING(x) syntax would have created. Sure, it might be harder, >> but it's not that much harder, and it's definitely not an unsolvable >> problem. > > I don't believe I did. The broader point is that the difficulty in > making that work reflects the conceptually messiness, from > user-visible aspects down. I can work with the difficulty, and I may > even be able to live with the messiness, but I'm trying to bring the > problems with it to a head sooner rather than later for good practical > reasons. In all sincerity, my real concern is that you or the others > will change your mind when I actually go and implement an OLD.* style > syntax, and see the gory details. I regret it if to ask this is to ask > too much of you, but FYI that's the thought process behind it. I think what's more likely is that we'll complain about the way you chose to implement it. I don't believe your contention (in the other email) that "Support for an OLD.* style syntax would have to exist at *all* stages of query execution, from parse analysis through to rewriting, planning, and execution." I think if your design for implementing that syntax requires that, and your design for some other syntax doesn't require that, then you're not thinking clearly enough about what needs to happen in parse analysis. Make the parse analysis for this syntax emit the same representation that you would have had it emit in the other syntax, and you won't have this problem. 
> What if we spelled > EXCLUDING/CONFLICTING as follows: > > INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val = > EXCLUDED || 'this works' WHERE another_col != EXCLUDED; > > What do you think of that? I am in complete agreement with the comments made by Petr and Alvaro. >>>> You're wrong. The choice of which index to use is clearly wildly >>>> inappropriately placed in the parse analysis phase, and if you think >>>> that has any chance of ever being committed by anyone, then you are >>>> presuming the existence of a committer who won't mind ignoring the >>>> almost-immediate revert request that would draw from, at the very >>>> least, Tom. >>> >>> Why? This has nothing to do with optimization. >> >> That is false. If there is more than one index that could be used, >> the system should select the best one. That is an optimization >> decision per se. Also, if a plan is saved - in the plancache, say, or >> in a view - the query can be re-planned if the index it depends on is >> dropped, but there's no way to do parse analysis. > > Generating index paths for the UPDATE is a waste of cycles. > Theoretically, there could be an (a, b, c) unique index and a (c,b,a) > unique index, and those two might have a non-equal cost to scan. But > that almost certainly isn't going to happen in practice, since that's > a rather questionable indexing strategy, and even when it does, you're > going to have to insert into all the unique indexes a good proportion > of the time anyway, making the benefits of that approach pale in > comparison to the costs. I don't care whether you actually generate index-paths or not, and in fact I suspect it makes no sense to do so. But I do care that you do a cost comparison between the available indexes and pick the one that looks cheapest. If somebody's got a bloated index and builds a new one, they will want it to get used even before they drop the old one. 
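[Editor's note: the bloated-index scenario described above is common in practice; a sketch of the sequence, with illustrative names:]

```sql
-- The existing unique index t_key has become bloated. A replacement is
-- built without blocking writes; both indexes coexist until the old
-- one is dropped. During that window, conflict arbitration should
-- prefer the new, cheaper index - a costing decision, hence an
-- argument for doing index selection in the planner.
CREATE UNIQUE INDEX CONCURRENTLY t_key_new ON t (key);
-- ... upserts arriving now should use t_key_new, not bloated t_key ...
DROP INDEX CONCURRENTLY t_key;
```
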
Even if that weren't an issue, though, the fact that you can't re-parse but you can re-plan is a killer point AFAICS. It means you are borked if the statement gets re-executed after the index you picked is dropped.

> From my point of view, I spent a significant amount of time making the
> patch more or less match your proposed design for unique index
> inference. It is discouraging to hear that you think I'm not
> cooperating with community process. I'm doing my best. I think it
> would be a bad idea for me to not engage with the community for an
> extended period at this point. There were plenty of other issues
> addressed by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
> you highlighted (or the other thing you highlighted around where to do
> unique index inference).

I think this gets at a point that has been bugging me and, perhaps, other people here. You often post a new patch with some fixes but without fixes for the issues that reviewers have indicated are top-of-mind for them. Sometimes, but not always, you say something like "I know this doesn't fix X but I'd like comments on other aspects of the patch". Even when you do, though, it can be difficult to look past something that appears to be a fundamental structural problem in order to comment on details, and sometimes it feels like that's what we're being asked to do.

When I'm reviewing, I tend to find issues more or less in proportion to the time I spend on the patch. If things that I complained about before haven't been fixed, I tend to find the same ones again, but not necessarily all that much faster than I found them the first time. So that's not efficient for me. I would not make an absolute rule that no patch should ever be re-posted without addressing every issue; I wouldn't be able to follow that rule in every case myself. But I try to follow it as often as I can, and I would suggest that you try to lean quite a bit more firmly in that direction.
I think you will make more progress, and spend less time arguing with others. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Oct 25, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Generating index paths for the UPDATE is a waste of cycles. >> Theoretically, there could be an (a, b, c) unique index and a (c,b,a) >> unique index, and those two might have a non-equal cost to scan. But >> that almost certainly isn't going to happen in practice, since that's >> a rather questionable indexing strategy, and even when it does, you're >> going to have to insert into all the unique indexes a good proportion >> of the time anyway, making the benefits of that approach pale in >> comparison to the costs. > > I don't care whether you actually generate index-paths or not, and in > fact I suspect it makes no sense to do so. But I do care that you do > a cost comparison between the available indexes and pick the one that > looks cheapest. If somebody's got a bloated index and builds a new > one, they will want it to get used even before they drop the old one. That seems extremely narrow. I am about ready to just do what you say, by costing the index based on something like relpages, but for the record I see no point. If I see no point, and you're sure that there is a point, then there is a danger that I'll miss the point, which contributes to my giving push back. I know I said that already, but I reiterate it once more for emphasis. > Even if that weren't an issue, though, the fact that you can't > re-parse but you can re-plan is a killer point AFAICS. It means you > are borked if the statement gets re-executed after the index you > picked is dropped. That simply isn't the case. As specifically noted in comments in the patch, relcache invalidation works in such a way as to invalidate the query proper, even when only an index has been dropped. 
For a time during development, the semantic dependence was more directly represented by actually having extract_query_dependencies() extract the arbitrating unique index pg_class Oid as a dependency for the benefit of the plan cache, but I ultimately decided against doing anything more than noting the potential future problem (the problem that arises in a world where relcache invalidation is more selective). But, I digress: I'll have inference occur during planning during the next revision since you think that's important. >> From my point of view, I spent a significant amount of time making the >> patch more or less match your proposed design for unique index >> inference. It is discouraging to hear that you think I'm not >> cooperating with community process. I'm doing my best. I think it >> would be a bad idea for me to not engage with the community for an >> extended period at this point. There were plenty of other issues >> address by V1.3 that were not the CONFLICTING()/EXCLUDING thing that >> you highlighted (or the other thing you highlighted around where to do >> unique index inference). > > I think this gets at a point that has been bugging me and, perhaps, > other people here. You often post a new patch with some fixes but > without fixes for the issues that reviewers have indicated are > top-of-mind for them. Sometimes, but not always, you say something > like "I know this doesn't fix X but I'd like comments on other aspects > of the patch". Even when you do, though, it can be difficult to > overlook something that appears to be a fundamental structural problem > to comment on details, and sometimes it feels like that's what we're > being asked to do. I think that it's accurate to say that I asked the community to do that once, last year. I was even very explicit about it at the time. I'm surprised that you think that's the case now, though. 
I learned my lesson a little later: don't do that, because fellow contributors are unwilling to go along with it, even for something as important as this. As recently as a few months ago, you wanted me to go a *completely* different direction with this (i.e. attempt an UPDATE first). I'm surprised that you're emphasizing the EXCLUDED()/CONFLICTING() thing so much. Should I take it that I more or less have your confidence that the way the patch fits together at a high level is sound? Because, that's the only way I can imagine you'd think that the EXCLUDED()/CONFLICTING() thing is of such fundamental importance at this point. It is after all split out as a much smaller commit on top of the main implementation (a commit which could easily be deferred). While it's important, it is very clearly subsidiary to the overall structure of my design, which I think that there has been precious little discussion of on this thread (e.g. the whole way I use EvalPlanQual(), the general idea of a special auxiliary plan that is executed in a special way, etc). That concerns me, because I suspect that the reason there has been next to no discussion of those aspects is because they're particularly hard things to have an opinion on, and yet are the things of immediate importance. Please correct me if I have it wrong. I am in no position to demand that you or anyone else discuss any particular aspect of the patch, of course. I am just conveying my concerns. Like all of us, I very much want to get this feature into the next release of PostgreSQL. > When I'm reviewing, I tend to find issues more or > less in proportion to the time I spend on the patch. If things that I > complained about before haven't been fixed, I tend to find the same > ones again, but not necessarily all that much faster than I found them > the first time. So that's not efficient for me. 
I would not make an > absolute rule that no patch should ever be re-posted without > addressing every issue; I wouldn't be able to follow that rule in > every case myself. But I try to follow it as often as I can, and I > would suggest that you try to lean quite a bit more firmly in that > direction. I think you will make more progress, and spend less time > arguing with others. I'll try. But let me point out that the *very first* thing you complained about, in relation to version 1.0, was that the plan structure was funny. V1.3 cleaned that up, making a "sequential scan" always occur, and having EXPLAIN treat that as a strict implementation detail, while still attributing some aspects of the hidden, unexecuted "sequential scan" (e.g. the qual) to its parent where that makes sense. This seems like something that squarely addressed your *original* concern. A little later, you (rightly) complained about the lack of support for inheritance, and to a lesser extent updatable views. As of V1.3, I have reasonable support for both of those things. And, of course, you wanted a unique index to be inferred from a set of columns/expressions, which (most notably) v1.3 now does. Also, I incorporated feedback from Kevin on some of those same items, as well as the behavior of statement-level triggers. Furthermore, I incorporated the feedback of Simon in having way more tests, in particular, two new isolation tests, one of which illustrates the qualitatively new "MVCC violation" in more detail. In short, I think I've done a pretty good job of incorporating feedback, and where I haven't I have been quite clear about it (it certainly didn't take you long to figure it out in this most recent instance). There is always room for improvement, but in my book V1.3 made a lot more than small, incremental progress. I am surprised by your remarks suggesting that the progress of each revision is excessively gradual in addressing your concerns. 
AFAICT, it's just that V1.3 wasn't totally comprehensive in doing so. In reality, the main reason for that is: getting this feature to a point where it might plausibly be committed is bloody difficult, as evidenced by the fact that it took this long for someone to produce a patch. Please don't lose sight of that. -- Peter Geoghegan
On Sun, Oct 26, 2014 at 4:39 PM, Peter Geoghegan <pg@heroku.com> wrote: >> I don't care whether you actually generate index-paths or not, and in >> fact I suspect it makes no sense to do so. But I do care that you do >> a cost comparison between the available indexes and pick the one that >> looks cheapest. If somebody's got a bloated index and builds a new >> one, they will want it to get used even before they drop the old one. > > That seems extremely narrow. I am about ready to just do what you say, > by costing the index based on something like relpages, but for the > record I see no point. If I see no point, and you're sure that there > is a point, then there is a danger that I'll miss the point, which > contributes to my giving push back. I know I said that already, but I > reiterate it once more for emphasis. I don't think it's either narrow or difficult: I think you just need to apply the existing index costing function. CLUSTER does a similar calculation to figure out whether to seq-scan or sort, even though it doesn't use index paths per se, or at least I don't think it does. >> Even if that weren't an issue, though, the fact that you can't >> re-parse but you can re-plan is a killer point AFAICS. It means you >> are borked if the statement gets re-executed after the index you >> picked is dropped. > > That simply isn't the case. As specifically noted in comments in the > patch, relcache invalidation works in such a way as to invalidate the > query proper, even when only an index has been dropped. For a time > during development, the semantic dependence was more directly > represented by actually having extract_query_dependencies() extract > the arbitrating unique index pg_class Oid as a dependency for the > benefit of the plan cache, but I ultimately decided against doing > anything more than noting the potential future problem (the problem > that arises in a world where relcache invalidation is more selective). 
I don't know what you mean by "invalidate the query proper". Here's my chain of reasoning: if the index selection occurs in parse analysis, then it's embedded in the parse tree. If this can be used in a writable CTE, then the parse tree can get stored someplace. If the index is then dropped, the parse tree is no good any more. Where's the flaw in that reasoning? I haven't looked at the patch to know exactly what will happen in that case, but it seems to me like it must either not work or there must be some ugly hack to make it work.

> But, I digress: I'll have inference occur during planning during the
> next revision since you think that's important.

Great.

> I think that it's accurate to say that I asked the community to do
> that once, last year. I was even very explicit about it at the time.
> I'm surprised that you think that's the case now, though. I learned my
> lesson a little later: don't do that, because fellow contributors are
> unwilling to go along with it, even for something as important as
> this.

I'm just telling you how I feel. I can't vouch for it being perfectly fair or accurate, though I do strive for that.

> As recently as a few months ago, you wanted me to go a *completely*
> different direction with this (i.e. attempt an UPDATE first). I'm
> surprised that you're emphasizing the EXCLUDED()/CONFLICTING() thing
> so much. Should I take it that I more or less have your confidence
> that the way the patch fits together at a high level is sound?

No. Commenting on one aspect of a patch doesn't imply agreement with other aspects of the patch. Please don't put words into my mouth. I haven't reviewed this patch in detail; I've only commented on specific aspects of it as they have arisen in discussion. I may or may not someday review it in detail, but not before I'm fairly confident that the known issues raised by other community members have been addressed as thoroughly as possible.
> In reality, the > main reason for that is: getting this feature to a point where it > might plausibly be committed is bloody difficult, as evidenced by the > fact that it took this long for someone to produce a patch. Please > don't lose sight of that. I'm not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 27 October 2014 15:55, Robert Haas <robertmhaas@gmail.com> wrote: > Commenting on one aspect of a patch doesn't imply agreement with > other aspects of the patch. Please don't put words into my mouth. I > haven't reviewed this patch in detail; I've only commented on specific > aspects of it as they have arisen in discussion. I may or may not > someday review it in detail, but not before I'm fairly confident that > the known issues raised by other community members have been addressed > as thoroughly as possible. +1 -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 27, 2014 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 27 October 2014 15:55, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> Commenting on one aspect of a patch doesn't imply agreement with
>> other aspects of the patch. Please don't put words into my mouth. I
>> haven't reviewed this patch in detail; I've only commented on specific
>> aspects of it as they have arisen in discussion. I may or may not
>> someday review it in detail, but not before I'm fairly confident that
>> the known issues raised by other community members have been addressed
>> as thoroughly as possible.
>
> +1

I wasn't putting words in anyone's mouth; I *don't* think that it's true that Robert thinks patches 0001-* and 0002-* are perfectly fine, and implied as much myself. I just think the *strongly* worded disapproval of the user-visible interface of 0003-* was odd; it was way out of proportion to its immediate importance to getting this patch on track. AFAICT it was the *only* feedback that I didn't act on with V1.3 (Robert's complaint about how inference happens during parse analysis was a response to V1.3). I'm not always going to be able to act on every item of feedback immediately, or I'll have my own ideas about how to handle certain things. I don't think that's all that big of a deal, since I've acted on almost all feedback. I think by far the biggest problem here is the lack of attention to the design from others.

I did a lot of copy-editing to the Wiki page yesterday. There are actually few clear open items now:
https://wiki.postgresql.org/wiki/UPSERT#Open_Items

Some previous "open items" have been moved to here:
https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd_properties_of_proposed_ON_CONFLICT_patch

This is basically a section describing things that have not been controversial or in need of adjusting, and may well never be, but which I wish we'd talk about, because they're in some way novel or counter-intuitive.
This is the kind of thing I'd like us to discuss more.

-- Peter Geoghegan
On 27 October 2014 17:44, Peter Geoghegan <pg@heroku.com> wrote:
> I did a lot of copy-editing to the Wiki page yesterday. There are
> actually few clear open items now:
> https://wiki.postgresql.org/wiki/UPSERT#Open_Items

I've read this page. Please do these things, both of which have been requested multiple times...

1. Take the specific docs that relate to the patch and put them in one place, so that everybody can read and understand and agree the behaviour of the patch. So that someone reading that can see *exactly* what is being proposed, not read through pages of other unchanged material hoping to catch the few points that really differ.

2. Replace CONFLICTING() with what I have asked for in earlier posts. The request is not repeated again, to avoid confusion.

-- Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 27, 2014 at 11:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > 1. Take the specific docs that relate to the patch and put them in one > place, so that everybody can read and understand and agree the > behaviour of the patch. So that someone reading that can see *exactly* > what is being proposed, not read through pages of other unchanged > material hoping to catch the few points that really differ. I'm afraid I don't follow. I have links to the user-visible documentation (v1.3) on the Wiki: https://wiki.postgresql.org/wiki/UPSERT#Documentation The documentation is complete. I link to every interesting page from the documentation directly from the Wiki, too. Of course, I also describe the issues in more detail for those with an interest in the implementation on the Wiki page itself (and list open issues). I have isolation tests that illustrate the new facets of visibility for READ COMMITTED, too. How, specifically, have I failed to do what you ask here? If you want to see exactly what has changed, in a differential fashion, well, that's what a diff is for. I'm not aware of any existing way of rendering to html for readability, while highlighting what is new. -- Peter Geoghegan
On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote: > I think by far the biggest problem here is the > lack of attention to the design from others. I find that attitude incredible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 27, 2014 at 1:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote: >> I think by far the biggest problem here is the >> lack of attention to the design from others. > > I find that attitude incredible. What I mean is that that's the thing that clearly needs scrutiny the most. That isn't an attitude - it's a technical opinion. -- Peter Geoghegan
On 27 October 2014 20:24, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Oct 27, 2014 at 1:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> I think by far the biggest problem here is the >>> lack of attention to the design from others. >> >> I find that attitude incredible. > > What I mean is that that's the thing that clearly needs scrutiny the > most. That isn't an attitude - it's a technical opinion. Let's see if we can link these two thoughts. 1. You think the biggest problem is the lack of attention to the design. 2. I keep asking you to put the docs in a readable form. If you can't understand the link between those two things, I am at a loss. Good luck with the patch. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 27, 2014 at 4:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Let's see if we can link these two thoughts.
>
> 1. You think the biggest problem is the lack of attention to the design.
>
> 2. I keep asking you to put the docs in a readable form.
>
> If you can't understand the link between those two things, I am at a loss.

You've read the docs. Please be clearer. In what sense are they not readable? The main description of the feature appears on the INSERT reference page:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

These paragraphs were added (there is more on the INSERT reference page than mentioned here, but this is the main part):

"""
The optional ON CONFLICT clause specifies a path to take as an alternative to raising a uniqueness violation error. ON CONFLICT IGNORE simply avoids inserting any individual row when it is determined that a uniqueness violation error would otherwise need to be raised. ON CONFLICT UPDATE has the system take an UPDATE path in respect of such rows instead.

ON CONFLICT UPDATE guarantees an atomic INSERT or UPDATE outcome. While rows may be updated, the top-level statement is still an INSERT, which is significant for the purposes of statement-level triggers and the rules system. Note that in the event of an ON CONFLICT path being taken, RETURNING returns no value in respect of any not-inserted rows.

ON CONFLICT UPDATE optionally accepts a WHERE clause condition. When provided, the statement only proceeds with updating if the condition is satisfied. Otherwise, unlike a conventional UPDATE, the row is still locked for update. Note that the condition is evaluated last, after a conflict has been identified as a candidate to update.

ON CONFLICT UPDATE accepts EXCLUDED expressions in both its targetlist and WHERE clause. This allows expressions (in particular, assignments) to reference rows originally proposed for insertion.
Note that the effects of all per-row BEFORE INSERT triggers are carried forward. This is particularly useful for multi-insert ON CONFLICT UPDATE statements; when inserting or updating multiple rows, constants need only appear once.

There are several restrictions on the ON CONFLICT UPDATE clause that do not apply to UPDATE statements. Subqueries may not appear in either the UPDATE targetlist or its WHERE clause (although simple multi-assignment expressions are supported). WHERE CURRENT OF cannot be used. In general, only columns in the target table, and excluded values originally proposed for insertion, may be referenced. Operators and functions may be used freely, though.

INSERT with an ON CONFLICT UPDATE clause is a deterministic statement: an INSERT statement that would UPDATE any single row more than once raises a cardinality violation error. Rows proposed for insertion should not duplicate each other in terms of attributes constrained by the conflict-arbitrating unique index. Note that the ordinary rules for unique indexes with regard to NULL apply analogously when deciding whether an arbitrating unique index indicates that the alternative path should be taken; multiple NULL values across a set of rows proposed for insertion are therefore acceptable, and the statement will always insert (assuming there is no unrelated error). Note that merely locking a row (by having it not satisfy the WHERE clause condition) does not count towards whether or not the row has been affected multiple times (and whether or not a cardinality violation error is raised).

ON CONFLICT UPDATE requires a column specification, which is used to infer an index to limit pre-checking for duplicates to. ON CONFLICT IGNORE makes this optional.
If a would-be uniqueness violation originates in some unique index other than the single index anticipated as the sole source of conflicts, the statement can fail to insert a row (taking the IGNORE path) when raising a uniqueness violation error may actually have been appropriate; omitting the specification indicates a total indifference to where any would-be uniqueness violation could occur. Note that the ON CONFLICT UPDATE assignment may result in a uniqueness violation, just as with a conventional UPDATE.

The rules for unique index inference are straightforward. Columns and/or expressions specified must match all the columns/expressions of some existing unique index on table_name. The order of the columns/expressions in the index definition, whether or not the index definition specified NULLS FIRST or NULLS LAST, and the internal sort order of each column (whether DESC or ASC was specified) are all irrelevant. However, partial unique indexes are not supported as arbiters of whether an alternative ON CONFLICT path should be taken. Deferred unique constraints are also not accepted.

You must have INSERT privilege on a table in order to insert into it, as well as UPDATE privilege if and only if ON CONFLICT UPDATE is specified. If a column list is specified, you only need INSERT privilege on the listed columns. Similarly, when ON CONFLICT UPDATE is specified, you only need UPDATE privilege on the column(s) that are listed to be updated, as well as SELECT privilege on any column whose values are read in the ON CONFLICT UPDATE expressions or condition. Use of the RETURNING clause requires SELECT privilege on all columns mentioned in RETURNING. If you use the query clause to insert rows from a query, you of course need to have SELECT privilege on any table or column used in the query.
"""

On the same page, there is commentary on each new addition to the grammar (e.g. expression_index, column_name_index, expression).
There are some worked examples, too (there are now 3 ON CONFLICT related examples out of 12 total):

"""
Insert or update new distributors as appropriate. Assumes a unique index has been defined that constrains values appearing in the did column. Note that an EXCLUDED expression is used to reference values originally proposed for insertion:

INSERT INTO distributors (did, dname)
VALUES (5, 'Gizmo transglobal'), (6, 'Doohickey, inc')
ON CONFLICT (did) UPDATE
SET dname = EXCLUDED(dname) || ' (formerly ' || dname || ')'

Insert a distributor, or do nothing for rows proposed for insertion when an existing, excluded row (a row with a matching constrained column or columns after before row insert triggers fire) exists. Assumes a unique index has been defined that constrains values appearing in the did column (although since the IGNORE variant was used, the specification of columns to infer a unique index from is not mandatory):

INSERT INTO distributors (did, dname)
VALUES (7, 'Doodad GmbH')
ON CONFLICT (did) IGNORE

Insert or update new distributors as appropriate. Assumes a unique index has been defined that constrains values appearing in the did column. WHERE clause is used to limit the rows actually updated (any existing row not updated will still be locked, though):

-- Don't update any existing row if it was already renamed at some
-- earlier stage
INSERT INTO distributors (did, dname)
VALUES (8, 'Thingamabob Distribution')
ON CONFLICT (did) UPDATE
SET dname = EXCLUDED(dname) || ' (formerly ' || dname || ')'
WHERE dname NOT LIKE '%(formerly %)'
"""

If you do not provide actionable feedback, then I cannot do anything. That's frustrating for me. Saying that the docs are not in a "readable form" is too vague for me to act on. If I'm not mistaken, they're in the same form (the same medium - sgml doc patches) that they've been for every patch ever submitted to PostgreSQL (in terms of being easy to read against the backdrop of existing docs, which you mentioned before).
I have built the html docs and uploaded them myself, as an additional convenience to reviewers. What standard am I failing to meet here? I spent *hours* editing the Wiki yesterday to make the structure clearer (little new content was added), and make the information exactly current. I am bending over backwards to make the user-visible behaviors clearer, and to document the patch well, both from a user-visible perspective, and an internal perspective. I have internal docs for both implementations of value locking (#1 and #2). Then there's the value locking Wiki page. I have produced *reams* of documentation at this point, all in an effort to make things easier to understand.

The user visible doc patch diff stats:

---
 doc/src/sgml/ddl.sgml                 |  23 +++
 doc/src/sgml/indices.sgml             |  11 +-
 doc/src/sgml/mvcc.sgml                |  43 ++++--
 doc/src/sgml/plpgsql.sgml             |  20 ++-
 doc/src/sgml/postgres-fdw.sgml        |   8 ++
 doc/src/sgml/ref/create_index.sgml    |   7 +-
 doc/src/sgml/ref/create_rule.sgml     |   6 +-
 doc/src/sgml/ref/create_table.sgml    |   5 +-
 doc/src/sgml/ref/create_trigger.sgml  |   5 +-
 doc/src/sgml/ref/create_view.sgml     |  36 ++++-
 doc/src/sgml/ref/insert.sgml          | 258 ++++++++++++++++++++++++++++++++--
 doc/src/sgml/ref/set_constraints.sgml |   6 +-
 doc/src/sgml/trigger.sgml             |  46 ++++--
 13 files changed, 426 insertions(+), 48 deletions(-)

#1 internal documentation stats:

 doc/src/sgml/indexam.sgml        | 133 ++++++++++++++++++++++++++++++++++++---
 src/backend/access/nbtree/README |  90 +++++++++++++++++++++++++-
 src/backend/executor/README      |  35 +++++++++++
 3 files changed, 247 insertions(+), 11 deletions(-)

#2 internal documentation stats:

---
 src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

Maybe those INSERT reference paragraphs could use another pass of copy-editing, but that's not what you said - you suggested some far more significant issue.
Some other issues, like exactly how triggers work with the feature are discussed on dedicated pages rather than the main INSERT reference page, which seems consistent with the existing documentation. Have I not provided a total of 4 isolation tests illustrating interesting concurrency/visibility interactions? That's a lot of isolation tests. Here is the tests commit stat: 31 files changed, 1159 insertions(+), 8 deletions(-) I really don't have any idea what you'd like me to do. Once again: please be more specific. I don't know what you mean. -- Peter Geoghegan
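[Editor's note] The insert-or-update behavior described in the documentation excerpts above can be sketched with SQLite's analogous upsert clause (available since SQLite 3.24), via Python's stdlib sqlite3 module. This is an analogue for illustration only, not the proposed PostgreSQL patch: SQLite spells the clause ON CONFLICT ... DO UPDATE, and what the patch writes as EXCLUDED(col) appears as excluded.col. The table and data are made up.

```python
# Sketch of insert-or-update semantics using SQLite's upsert clause
# (SQLite >= 3.24). Assumed analogue only; syntax differs from the patch.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distributors (did INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("INSERT INTO distributors VALUES (5, 'Gizmo transglobal')")

# did=5 conflicts and takes the UPDATE path; did=6 is simply inserted.
conn.execute("""
    INSERT INTO distributors (did, dname)
    VALUES (5, 'Gizmo GmbH'), (6, 'Doohickey, inc')
    ON CONFLICT (did) DO UPDATE
    SET dname = excluded.dname || ' (formerly ' || dname || ')'
""")

rows = dict(conn.execute("SELECT did, dname FROM distributors ORDER BY did"))
print(rows)
# {5: 'Gizmo GmbH (formerly Gizmo transglobal)', 6: 'Doohickey, inc'}
```

Note how excluded.dname refers to the value proposed for insertion, while the bare dname refers to the existing row, mirroring the EXCLUDED expressions described in the quoted docs.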
> ---
> doc/src/sgml/ddl.sgml                 |  23 +++
> doc/src/sgml/indices.sgml             |  11 +-
> doc/src/sgml/mvcc.sgml                |  43 ++++--
> doc/src/sgml/plpgsql.sgml             |  20 ++-
> doc/src/sgml/postgres-fdw.sgml        |   8 ++
> doc/src/sgml/ref/create_index.sgml    |   7 +-
> doc/src/sgml/ref/create_rule.sgml     |   6 +-
> doc/src/sgml/ref/create_table.sgml    |   5 +-
> doc/src/sgml/ref/create_trigger.sgml  |   5 +-
> doc/src/sgml/ref/create_view.sgml     |  36 ++++-
> doc/src/sgml/ref/insert.sgml          | 258 ++++++++++++++++++++++++++++++++--
> doc/src/sgml/ref/set_constraints.sgml |   6 +-
> doc/src/sgml/trigger.sgml             |  46 ++++--
> 13 files changed, 426 insertions(+), 48 deletions(-)
>
> #1 internal documentation stats:
>
> doc/src/sgml/indexam.sgml        | 133 ++++++++++++++++++++++++++++++++++++---
> src/backend/access/nbtree/README |  90 +++++++++++++++++++++++++-
> src/backend/executor/README      |  35 +++++++++++
> 3 files changed, 247 insertions(+), 11 deletions(-)
>
> #2 internal documentation stats:
>
> ---
> src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)

Just to put that in context, here are the documentation changes from the original LATERAL commit:

 doc/src/sgml/keywords.sgml   |   2 +-
 doc/src/sgml/queries.sgml    |  83 +++++++++++++++++++++++++++++++++++++++++-
 doc/src/sgml/ref/select.sgml | 102 +++++++++++++++++++++++++++++++++++++++++++++-------

Commit 0ef0b30 added data-modifying CTE docs (docs only). That looks like:

 doc/src/sgml/queries.sgml    | 177 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 doc/src/sgml/ref/select.sgml |  49 ++++++++++++++---
 2 files changed, 214 insertions(+), 12 deletions(-)

> Have I not provided a total of 4 isolation tests illustrating
> interesting concurrency/visibility interactions? That's a lot of
> isolation tests.
> Here is the tests commit stat:
>
> 31 files changed, 1159 insertions(+), 8 deletions(-)

And to put the tests in context, here are the stats from the original Hot Standby commit:

 src/test/regress/expected/hs_standby_allowed.out    | 215 ++++++++++++++++++++++++++
 src/test/regress/expected/hs_standby_check.out      |  20 +++
 src/test/regress/expected/hs_standby_disallowed.out | 137 +++++++++++++++++
 src/test/regress/expected/hs_standby_functions.out  |  40 +++++
 src/test/regress/pg_regress.c                       |  33 ++--
 src/test/regress/sql/hs_primary_extremes.sql        |  74 +++++++++
 src/test/regress/sql/hs_primary_setup.sql           |  25 +++
 src/test/regress/sql/hs_standby_allowed.sql         | 121 +++++++++++++++
 src/test/regress/sql/hs_standby_check.sql           |  16 ++
 src/test/regress/sql/hs_standby_disallowed.sql      | 105 +++++++++++++
 src/test/regress/sql/hs_standby_functions.sql       |  24 +++
 src/test/regress/standby_schedule                   |  21 +++

So (at least as measured by raw lines of tests), this feature is better tested than the original Hot Standby commit, and by a wide margin. Tests also serve as an explanatory tool for the feature (in particular, isolation tests can be used in this way).

-- Peter Geoghegan
On Mon, Oct 27, 2014 at 5:15 PM, Peter Geoghegan <pg@heroku.com> wrote: >> Let's see if we can link these two thoughts. >> >> 1. You think the biggest problem is the lack of attention to the design. >> >> 2. I keep asking you to put the docs in a readable form. >> >> If you can't understand the link between those two things, I am at a loss. > > You've read the docs. Please be clearer. In what sense are they not > readable? The main description of the feature appears on the INSERT > reference page: > > http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html I've updated that reference page. I did a fair amount of copy-editing, but also updated the docs to describe the latest (unpublished) refinements to the syntax. Which is, as you and Robert requested, that the target and rejected-for-insertion tuples may be referenced with magical aliases in the style of OLD.* and NEW.*. I've spelt these aliases as TARGET.* and EXCLUDED.*, since OLD.* and NEW.* didn't seem to make much sense here. This requires some special processing during rewriting (which, as you probably know, is true of DML statements in general), and is certainly more invasive than what I had before, but all told isn't too bad. Basically, there is still an ExcludedExpr, but it only appears in the post-rewrite query tree, and is never created by the raw grammar or processed during parse analysis. I attach the doc patch with the relevant changes, in case you'd like a quick reference to where things are changed. I have already implemented the two things that you and Robert asked for most recently: A costing model for unique index inference, and the above syntax. I've also added IGNORE support to postgres_fdw (so you can IGNORE if and only if a unique index inference specification is omitted, just as with updatable views since V1.3). 
Currently, I'm working on fixing an issue with RLS that I describe in detail here: https://wiki.postgresql.org/wiki/UPSERT#RLS Once I fix that (provided it doesn't take too long), I'll publish a V1.4. AFAICT, that'll close out all of the current open issues. I hope this goes some way towards addressing your concerns. -- Peter Geoghegan
Attachment
On Wed, Nov 5, 2014 at 1:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Once I fix that (provided it doesn't take too long), I'll publish a
> V1.4. AFAICT, that'll close out all of the current open issues.

Attached is V1.4. As with V1.3, I continue to maintain both approaches to value locking in parallel, believing this to be the most useful direction for development to take for the time being. The consensus is for approach #2 to value locking [1], but I see no reason to deny reviewers the chance to compare both approaches. It's easy to maintain the two, as the value locking implementation is well encapsulated - the executor-level stuff that has been altered in the last few revisions tends to cause very few or no conflicts when rebasing.

Highlights
=======

* Costing of indexes for the purposes of determining which to have arbitrate whether or not the executor takes the alternative path. So, a list of expressions is created during parse analysis, and that list is matched against existing indexes during optimization. It's usually possible to avoid the work of generating paths, because (it seems reasonable to suppose) there are usually 0 or 1 possible indexes in representative cases. If it's 0, we get an error, originating from where we now do this work -- the optimizer.

* EXCLUDED.* (and TARGET.*) pseudo-aliases (compare OLD.* and NEW.* in the context of user-defined rules and conditional triggers) are visible within the auxiliary UPDATE (but not the parent INSERT). See the commit message for details on how that works. In short, we still have a dedicated primnode expression, ExcludedExpr, but it is not ever generated by the raw grammar (it can only be added during the rewriting stage of query processing). It's just a facade, but a perfectly convincing one.
Note that this means that Vars can be referenced from "another RTE" in what is actually a relation scan node of the target:

postgres=# explain INSERT INTO upsert values(1, 'foo') on conflict (key)
           update set val = excluded.val where excluded.val != 'bar';
                               QUERY PLAN
------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on upsert  (cost=0.00..32.99 rows=1591 width=36)
         Filter: ((excluded.val) <> 'bar'::text)
(4 rows)

Here, you're seeing a "Conflict Update" scan (actually, a quasi-hidden sequential scan) on the upsert table that references a Var from the facade excluded.* table/RTE. In fact, the Var is on the target table, but read through our internal expression primnode (ExcludedExpr) so as to get access to the excluded-from-insertion tuple slot during EPQ expression evaluation for the UPDATE.

* postgres_fdw support for the IGNORE variant (provided there was no unique index inference specification - just as with updatable views).

* Documentation clean-up - as I mentioned, I tried to address Simon's concerns here. Also, as you'd expect, the documentation has been fixed up to reflect the new syntax. I'll need to take a pass at updating the UPSERT Wiki page soon, too.

Next steps
========

AFAICT, this revision addresses all open items bar one - the RLS bug, which I could not decide on a fix for. I refer to the RLS issue described on the Wiki [2]. As I mentioned before, I'd really like to get some reviewer time on the executor-level aspects of this, which are relatively new, and have received no scrutiny from anyone else that I'm aware of. This list of items is a good place to start, for those that are interested:

https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd_properties_of_proposed_ON_CONFLICT_patch

My use of the EvalPlanQual() mechanism, and the structure of the plan tree in general, could really use some scrutiny too.
Thanks [1] https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29 [2] https://wiki.postgresql.org/wiki/UPSERT#RLS -- Peter Geoghegan
Attachment
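[Editor's note] The conditional update that Peter's EXPLAIN example demonstrates (update only WHERE excluded.val != 'bar') also has a close analogue in SQLite's upsert clause, which can serve as a hedged sketch via Python's sqlite3 module. One important difference: SQLite does not lock the existing row when the WHERE clause isn't satisfied, whereas the patch does.

```python
# Sketch of a conditional ON CONFLICT update, using SQLite (>= 3.24) as a
# stand-in for the unreleased patch. The WHERE clause on DO UPDATE is
# evaluated only after a conflict is found, as in the patch's design.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upsert (key INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO upsert VALUES (1, 'old')")

# excluded.val is 'bar': the WHERE clause fails, so no update happens.
conn.execute("""
    INSERT INTO upsert VALUES (1, 'bar')
    ON CONFLICT (key) DO UPDATE SET val = excluded.val
    WHERE excluded.val != 'bar'
""")
# excluded.val is 'foo': the WHERE clause passes, so the row is updated.
conn.execute("""
    INSERT INTO upsert VALUES (1, 'foo')
    ON CONFLICT (key) DO UPDATE SET val = excluded.val
    WHERE excluded.val != 'bar'
""")

val = conn.execute("SELECT val FROM upsert WHERE key = 1").fetchone()[0]
print(val)  # 'foo'
```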
On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote: > * Documentation clean-up - as I mentioned, I tried to address Simon's > concerns here. Also, as you'd expect, the documentation has been fixed > up to reflect the new syntax. I'll need to take a pass at updating the > UPSERT Wiki page soon, too. I should mention that I've updated the Wiki to reflect the current, post-v1.4 state of affairs. That remains the best place to get a relatively quick overview of where things stand with open items, discussion of the patch, etc: https://wiki.postgresql.org/wiki/UPSERT -- Peter Geoghegan
On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote: > Attached is V1.4. Someone mentioned to me privately that they weren't sure that the question of whether or not RETURNING only projected actually inserted tuples was the right one. Also, I think someone else mentioned this a few months back. I'd like to address this question directly sooner rather than later, and so I've added a note on the Wiki page in relation to this [1]. It's a possible area of concern at this point. Anyway, it wouldn't require much implementation effort to change the behavior so that updated tuples were also projected. In addition, we might also consider the necessity of inventing a mechanism to make apparent whether the tuple was inserted or updated. The discussion needs to happen first, though. [1] https://wiki.postgresql.org/wiki/UPSERT#RETURNING_behavior -- Peter Geoghegan
On 11/20/2014 01:52 AM, Peter Geoghegan wrote:
> On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Also, I think someone else mentioned this a few months back.

Yeah, that was me. I think we have three options:

1. Return only inserted tuples
2. Return inserted and updated tuples
3. Return inserted, updated and skipped tuples

To me option 1 is surprising and less useful, since I imagine that in most cases where you do an upsert you do not care whether the tuple was inserted or updated as long as it has the right values after the upsert, and those values are also what I would expect to be returned.

The possible use case I see for option 3 is when you want the values of automatically generated columns but there is actually no work to do if another transaction has already inserted the same row (same according to the unique constraints). But this behavior, even though useful in certain cases, might be surprising.

Andreas
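[Editor's note] Andreas's three RETURNING options can be made concrete with a small, purely hypothetical Python model. Nothing here is real PostgreSQL API; the `upsert` helper, its predicate, and the data are invented solely to show which rows each option would project.

```python
# Hypothetical model of upsert outcomes, to contrast the three proposed
# RETURNING behaviors. "skipped" covers rows merely locked because the
# auxiliary UPDATE's WHERE clause was not satisfied.

def upsert(table, rows, update_pred=lambda existing, new: True):
    """Upsert (key, val) rows into a dict; classify each row's outcome."""
    outcomes = []
    for key, val in rows:
        if key not in table:
            table[key] = val
            outcomes.append((key, val, "inserted"))
        elif update_pred(table[key], val):
            table[key] = val
            outcomes.append((key, val, "updated"))
        else:
            outcomes.append((key, table[key], "skipped"))
    return outcomes

table = {1: "a", 2: "x"}
outcomes = upsert(table, [(1, "b"), (2, "y"), (3, "d")],
                  update_pred=lambda existing, new: existing == "a")

option1 = [o for o in outcomes if o[2] == "inserted"]          # 1 row
option2 = [o for o in outcomes if o[2] in ("inserted", "updated")]  # 2 rows
option3 = outcomes                                             # 3 rows
print(len(option1), len(option2), len(option3))  # 1 2 3
```

Under option 1, the updated row (key 1) silently disappears from RETURNING even though the statement changed it, which is the surprise Andreas describes.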
On Wed, Nov 19, 2014 at 5:37 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> I think we have three options.
>
> 1. Return only inserted tuples
> 2. Return inserted and updated tuples
> 3. Return inserted, updated and skipped tuples
>
> To me option 1 is surprising and less useful since I imagine in most cases
> where you do an upsert you do not care if the tuple was inserted or updated
> as long as it has the right values after the upsert, and these values is
> also what I would expect to be returned.

I can see why you'd say that about option 1. That also seems like an argument against surfacing the distinction directly (through a dedicated hidden column or other expression that RETURNING might reference, say).

> The possible use case I see for option 3 is when you want the values of
> automatically generated columns but there is actually no work to do if
> another transaction had already inserted the same row (same according to the
> unique constraints). But this behavior even though useful in certain cases
> might be surprising.

I think that 3 is out. It seems hard to justify not RETURNING anything in respect of a slot when there is a before row insert trigger that returns NULL on the one hand, but RETURNING something despite not inserting for ON CONFLICT UPDATE on the other.

I think if we do this, we're also going to have to set a command tag. That could look like this:

postgres=# INSERT INTO upsert values(1, 'Foo'), (2, 'Bar')
           ON CONFLICT (key) UPDATE SET val = EXCLUDED.val;
INSERT 0 1
UPDATE 1

Or perhaps like this:

postgres=# INSERT INTO upsert values(1, 'Foo'), (2, 'Bar')
           ON CONFLICT (key) UPDATE SET val = EXCLUDED.val;
UPSERT 0 2

Maybe the latter is better, because it's less likely to break tools that currently parse the command tag. But if we went with the former command tag format, we'd have to figure out if there should always be an "UPDATE part" of INSERT command tags generally, even when there was no ON CONFLICT UPDATE clause.
I guess in that case it would have to become stable/consistent across INSERTs, so we'd always have an "UPDATE part", but I'm not sure. -- Peter Geoghegan
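[Editor's illustration] The two candidate tag formats above differ in how much work they push onto client tools. As a rough sketch of the parsing burden, here is a hypothetical Python helper that tolerates the classic tags as well as both formats floated in this thread; the function name and the final tag format are assumptions, since nothing was settled at this point:

```python
def affected_rows(tag: str) -> int:
    """Sum the trailing row counts from a command tag string.

    Handles the classic forms ("INSERT 0 1", "UPDATE 1") as well as both
    formats floated in this thread ("INSERT 0 1 UPDATE 1" and
    "UPSERT 0 2").  Purely illustrative.
    """
    total = 0
    parts = tag.split()
    i = 0
    while i < len(parts):
        word = parts[i]
        if word in ("INSERT", "UPSERT"):
            # INSERT-style tags carry a legacy OID field before the count
            total += int(parts[i + 2])
            i += 3
        elif word == "UPDATE":
            total += int(parts[i + 1])
            i += 2
        else:
            raise ValueError("unrecognized command tag: %r" % tag)
    return total
```

A tool written this way would survive either choice, which is part of why the single "UPSERT 0 2" tag (one trailing count, same shape as existing tags) looks less disruptive.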
On Wed, Nov 19, 2014 at 6:04 PM, Peter Geoghegan <pg@heroku.com> wrote: > I think that 3 is out. It seems hard to justify not RETURNING anything > in respect of a slot when there is a before row insert trigger that > returns NULL on the one hand, but RETURNING something despite not > inserting for ON CONFLICT UPDATE on the other. I mean: despite not inserting *or updating*. -- Peter Geoghegan
On Wed, 2014-11-19 at 16:52 -0800, Peter Geoghegan wrote:
> Someone mentioned to me privately that they weren't sure that the
> question of whether or not RETURNING only projected actually inserted
> tuples was the right one. Also, I think someone else mentioned this a
> few months back. I'd like to address this question directly sooner
> rather than later, and so I've added a note on the Wiki page in
> relation to this [1]. It's a possible area of concern at this point.

I think the biggest problem with the current approach is that there is no way to know if a row was skipped by the WHERE clause when using INSERT ON CONFLICT UPDATE ... WHERE.

I am a developer of the Django ORM. Django reports to the user whether a row was inserted or updated. It is possible to know which rows were inserted by returning the primary key value: if something is returned, then it was an insert. If Django implements updated-vs-inserted checking this way, and PostgreSQL adds RETURNING for the update case later on, that would be a breaking change for Django.

So, if it is not too hard to implement RETURNING for the update case, then I think it should be done. A pseudo-column informing the user whether the result was an update or an insert would then be a requirement for Django. Changing the RETURNING behavior in later releases might cause problems due to backwards compatibility.

- Anssi
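[Editor's illustration] The inference Anssi describes - "something returned means it was an insert" - can be modeled with a toy in-memory sketch of the v1.x semantics, where RETURNING projects only freshly inserted tuples. This is illustrative Python, not Django or driver code; the table is just a dict:

```python
def upsert_returning(table, key, val):
    """Toy model of INSERT ... ON CONFLICT UPDATE ... RETURNING under the
    semantics discussed here: only inserted tuples are projected."""
    if key in table:
        table[key] = val        # conflict: auxiliary UPDATE path taken
        return []               # updated rows are not projected
    table[key] = val
    return [(key, val)]         # freshly inserted row is projected

def save(table, key, val):
    """Anssi-style inference: a returned row means the save was an insert."""
    return len(upsert_returning(table, key, val)) == 1
```

Under these semantics the inference is sound, but it silently breaks the moment RETURNING starts projecting updated tuples as well - which is exactly the compatibility hazard being pointed out.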
On Wed, Nov 19, 2014 at 10:37 PM, Anssi Kääriäinen <anssi.kaariainen@thl.fi> wrote: > I think the biggest problem with the current approach is that there is > no way to know if a row was skipped by the where clause when using > INSERT ON CONFLICT UPDATE ... WHERE. Well, there could have always been an UPDATE in a trigger or something like that. > I am a developer of the Django ORM. Django reports to the user whether a > row was inserted or updated. It is possible to know which rows were > inserted by returning the primary key value. If something is returned, > then it was an insert. If Django implements updated vs inserted checking > this way, then if PostgreSQL adds RETURNING for update case later on, > that would be a breaking change for Django. How does that actually work at the moment? Do you use RETURNING, or look at the command tag? Would you be happy to just know that certain rows were either inserted or updated in the context of an UPSERT (and not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning NULL), or do you want to specifically know if there was an insert or an update in respect of each row/slot processed? -- Peter Geoghegan
On Thu, Nov 20, 2014 at 1:42 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Would you be happy to just know that certain
> rows were either inserted or updated in the context of an UPSERT (and
> not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
> NULL)

Of course, having the WHERE clause in the auxiliary UPDATE not pass would also be cause to *not* return/project the not-processed row/slot (in a world where we do something with RETURNING in respect of rows actually processed by the auxiliary UPDATE). I mean, you're seeing the final version of the row when RETURNING with an UPDATE, and if the UPDATE is never evaluated, then the would-be final version (which is generally based on the TARGET tuple and EXCLUDED tuple, as processed by the UPDATE) never exists, and so clearly cannot be projected by RETURNING.

This explanation is a tiny bit misleading, because the rows/slots not affected by the UPDATE (or INSERT) are still *locked*, even when the UPDATE's WHERE clause does not pass - they have been processed to the extent that they were locked. This is also true of postgres_fdw in certain situations, but it seems like a very minor issue.

-- Peter Geoghegan
On Thu, 2014-11-20 at 13:42 -0800, Peter Geoghegan wrote:
> > I am a developer of the Django ORM. Django reports to the user whether a
> > row was inserted or updated. It is possible to know which rows were
> > inserted by returning the primary key value. If something is returned,
> > then it was an insert. If Django implements updated vs inserted checking
> > this way, then if PostgreSQL adds RETURNING for update case later on,
> > that would be a breaking change for Django.
>
> How does that actually work at the moment? Do you use RETURNING, or
> look at the command tag? Would you be happy to just know that certain
> rows were either inserted or updated in the context of an UPSERT (and
> not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
> NULL), or do you want to specifically know if there was an insert or
> an update in respect of each row/slot processed?

Django currently uses the command tag to check whether a row was updated. We also use RETURNING to get SERIAL values back from the database on insert.

The most likely place to use this functionality in Django is Model.save(). This method is best defined as "make sure this object's state is either inserted or updated in the database by the primary key of the object". The Model.save() method also needs to report whether the model was created or updated. The command tag is sufficient for this case. So, the proposed feature now has everything Django needs for Model.save().

Django might add a bulk_merge(objs) command later on. This is defined as "make sure each obj in objs is persisted to the database using the fastest way available". The INSERT ON CONFLICT UPDATE command looks excellent for this case. Here it will be more problematic to check which rows were inserted and which updated, as we need that information for each primary key value separately.

When I think of this feature outside of Django, it seems completely reasonable to return database-computed values on UPSERT.
This requires two queries with the proposed API. My opinion is that RETURNING for the update case is better than not having it. - Anssi
On Thu, Nov 20, 2014 at 10:58 PM, Anssi Kääriäinen <anssi.kaariainen@thl.fi> wrote:
> Django uses the command tag currently to check if a row was updated. We
> also use RETURNING to get SERIAL values back from the database on
> insert.
>
> The most likely place to use this functionality in Django is
> Model.save(). This method is best defined as "make sure this object's
> state is either inserted or updated to the database by the primary key
> of the object". The Model.save() method needs to also report if the
> model was created or updated. The command tag is sufficient for this
> case.
>
> So, the proposed feature now has everything Django needs for
> Model.save().

So, to be clear, it would be okay if the command tag reported the number of rows *upserted*, without making any distinction between whether they were actually inserted or updated? That seems like the least invasive thing, since top-level commands historically report a single number of affected rows (FWIW I wouldn't report the first OID of a single inserted tuple, as currently happens with the INSERT command tag). The ON CONFLICT IGNORE variant would still report the number of rows inserted, though.

RETURNING would show the resulting rows from either the insert or the update for each slot processed to completion (i.e. actually inserted or updated) by the upsert, without making any user-visible distinction (you asked for an upsert, so you must not care).

> Django might add a bulk_merge(objs) command later on. This is defined as
> "make sure each obj in objs is persisted to the database using the
> fastest way available". The INSERT ON CONFLICT UPDATE command looks
> excellent for this case. In this case it will be more problematic to
> check which rows were inserted, which update, as we need information for
> each primary key value separately for this case.

Cool.

> When I think of this feature outside of Django, it seems it is
> completely reasonable to return database computed values on UPSERT.
This > requires two queries with the proposed API. My opinion is that RETURNING > for the update case is better than not having it. I am almost convinced that that behavior is better. I would like to hear more opinions before looking at adding the necessary changes to the implementation, though. It might be a bit questionable that ON CONFLICT IGNORE and ON CONFLICT UPDATE have different command tags, for example. What do other people think? Should RETURNING project updated tuples as well as inserted tuples, as described here? -- Peter Geoghegan
On Fri, Nov 21, 2014 at 3:38 PM, Peter Geoghegan <pg@heroku.com> wrote: > What do other people think? Should RETURNING project updated tuples as > well as inserted tuples, as described here? I think it should. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Nov 24, 2014 at 6:26 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Nov 21, 2014 at 3:38 PM, Peter Geoghegan <pg@heroku.com> wrote: >> What do other people think? Should RETURNING project updated tuples as >> well as inserted tuples, as described here? > > I think it should. Looks like the consensus is that we should have RETURNING project updated tuples too, then. I've already written the code to do this (and to report an "UPSERT" command tag), which is very straightforward. The next revision will have this behavior. However, I'm going to wait a little while longer before formally publishing a new revision. -- Peter Geoghegan
On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Looks like the consensus is that we should have RETURNING project
> updated tuples too, then.

Attached revision, v1.5, establishes this behavior (as always, there is a variant for each approach to value locking). There is a new commit with a commit message describing the new RETURNING/command tag behavior in detail, so no need to repeat it here. The documentation has been updated in these areas, too. There are also one or two tiny comment tweaks here and there, as well as a pg_proc OID collision fix in the value locking approach #1 variant.

My mirror of the documentation (i.e. an html build) has been updated.

INSERT command documentation (for the new RETURNING behavior):
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

Details of changes to the command tag:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/protocol-message-formats.html

I'll make a pass at the Wiki page to reflect these changes soon.

-- Peter Geoghegan
On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote:
> On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > Looks like the consensus is that we should have RETURNING project
> > updated tuples too, then.
>
> Attached revision, v1.5, establishes this behavior (as always, there
> is a variant for each approach to value locking). There is a new
> commit with a commit message describing the new RETURNING/command tag
> behavior in detail, so no need to repeat it here. The documentation
> has been updated in these areas, too.

It seems there isn't any way to distinguish between an insert and an update of a given row. Maybe a pseudo-column can be added so that it can be used in the RETURNING clause:

insert into foobar(id, other_col) values(2, '2') on conflict (id) update set other_col=excluded.other_col returning id, pseudo.was_updated;

This would ensure that users could check, for each primary key value, whether the row was updated or inserted. Of course, the pseudo.was_updated name should be replaced by something better.

It would be nice to be able to skip updates of rows that were not changed:

insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col where target is distinct from excluded;

- Anssi Kääriäinen
On 12/04/2014 07:07 PM, Anssi Kääriäinen wrote: > On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote: >> On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> Looks like the consensus is that we should have RETURNING project >>> updated tuples too, then. >> >> Attached revision, v1.5, establishes this behavior (as always, there >> is a variant for each approach to value locking). There is a new >> commit with a commit message describing the new RETURNING/command tag >> behavior in detail, so no need to repeat it here. The documentation >> has been updated in these areas, too. > > It seems there isn't any way to distinguish between insert and update of > given row. Maybe a pseudo-column can be added so that it can be used in > the returning statement Yes, I think that's pretty important. With a negative attno so it's treated as a "hidden" col that must be explicitly named to be shown and won't be confused with user columns. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 4, 2014 at 3:04 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > Yes, I think that's pretty important. With a negative attno so it's > treated as a "hidden" col that must be explicitly named to be shown and > won't be confused with user columns. I think that the standard for adding a new system attribute ought to be enormous. The only case where a new one was added post-Postgres95 was "tableoid". I'm pretty sure that others aren't going to want to do it that way. Besides, I'm not entirely convinced that this is actually an important distinction to expose. -- Peter Geoghegan
On Thu, 2014-12-04 at 10:27 -0800, Peter Geoghegan wrote:
> I think that the standard for adding a new system attribute ought to
> be enormous. The only case where a new one was added post-Postgres95
> was "tableoid". I'm pretty sure that others aren't going to want to do
> it that way. Besides, I'm not entirely convinced that this is actually
> an important distinction to expose.

For Django's use case this is a requirement. We must inform the user if the save() action created a new row or if it modified an existing one.

Another way to do this would be to expose the "excluded" alias in the returning clause. All columns of the excluded alias would be null in the case of insert (especially the primary key column), and thus if a query

insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col returning excluded.id

returns a non-null value, then it was an update.

- Anssi
On Thu, Dec 4, 2014 at 10:27 PM, Anssi Kääriäinen <anssi.kaariainen@thl.fi> wrote:
> For Django's use case this is a requirement. We must inform the user if
> the save() action created a new row or if it modified an existing one.

Can you explain that in more detail, please?

> Another way to do this would be to expose the "excluded" alias in the
> returning clause. All columns of the excluded alias would be null in
> the case of insert (especially the primary key column), and thus if a
> query
> insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col returning excluded.id
> returns a non-null value, then it was an update.

I don't like that idea much, TBH. Consider this:

postgres=# update upsert u set key = 1 from upsert i returning key;
ERROR:  42702: column reference "key" is ambiguous
LINE 1: update upsert u set key = 1 from upsert i returning key;
                                                            ^

So, suppose this example was actually an ON CONFLICT UPDATE query. If I similarly make the aliases in the ON CONFLICT UPDATE ("target"/"excluded") visible in the returning list, it becomes necessary to qualify every column - an ambiguity is introduced by making both aliases visible, since any non-qualified column in the RETURNING clause could be from either the "target" or "excluded" alias/RTE. This is particularly annoying for the common, simple cases.

Also, when there was an update in respect of any given slot, how, in general, can I be sure that *any* visible excluded.* attribute is not null (which you suggest as a reliable proxy for the update path having been taken)? For one thing, the unique index that arbitrates whether or not we take the "alternative path" is not restricted to covering only non-nullable attributes. So does the user end up specifying system/hidden attributes, just to make what you outline work? That seems sort of messy.

-- Peter Geoghegan
On Fri, 2014-12-05 at 00:21 -0800, Peter Geoghegan wrote:
> On Thu, Dec 4, 2014 at 10:27 PM, Anssi Kääriäinen
> <anssi.kaariainen@thl.fi> wrote:
> > For Django's use case this is a requirement. We must inform the user if
> > the save() action created a new row or if it modified an existing one.
>
> Can you explain that in more detail, please?

Django has a post_save signal. The signal provides information about the save operation. One piece of information is a created boolean flag: when True, the operation was an insert; when False, it was an update. See https://docs.djangoproject.com/en/1.7/ref/signals/#django.db.models.signals.post_save for details.

The created flag is typically used to perform some related action. An example is the User and UserProfile models. Each user must have a UserProfile, so post_save can be used to create an empty user profile on creation of a user.

If Django is going to use the INSERT ... ON CONFLICT UPDATE variant for the existing save() method, then it needs to know if the result was an UPDATE or an INSERT. If we are going to use this for other operations (for example, a bulk merge of rows into the database), it would be very convenient to have per-row updated/created information available, so that we can fire the post_save signals for the rows. If we don't have that information available, it means we can't fire signals, and no signals means we can't use the bulk merge operation internally, as we have to fire the signals wherever they were fired before.

Outside of Django there are likely similar reasons to want to know if the result of an operation was the creation of a new row. The reason could be creation of a related row, some action in the application layer, or just a UI message telling "object created successfully" vs "object updated successfully".

- Anssi
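[Editor's illustration] For readers unfamiliar with Django, the dependency on a per-row created flag can be sketched with a toy signal dispatcher. This stands in for Django's actual post_save machinery; the class and receiver here are illustrative, not Django code:

```python
# Minimal stand-in for Django's signal machinery, showing why receivers
# need a per-row inserted/updated distinction: they branch on `created`.
class Signal:
    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, instance, created):
        for receiver in self._receivers:
            receiver(instance=instance, created=created)

post_save = Signal()
profiles = []

def make_profile(instance, created):
    # e.g. create an empty UserProfile only for brand-new Users
    if created:
        profiles.append("profile for %s" % instance)

post_save.connect(make_profile)
post_save.send("alice", created=True)   # insert path: profile created
post_save.send("alice", created=False)  # update path: no new profile
```

If the upsert statement cannot report which rows were inserted and which were updated, there is no way to fill in `created` for each row, and signal-firing code paths cannot use it.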
On Thu, Dec 4, 2014 at 1:27 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Dec 4, 2014 at 3:04 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> Yes, I think that's pretty important. With a negative attno so it's >> treated as a "hidden" col that must be explicitly named to be shown and >> won't be confused with user columns. > > I think that the standard for adding a new system attribute ought to > be enormous. The only case where a new one was added post-Postgres95 > was "tableoid". I'm pretty sure that others aren't going to want to do > it that way. +1. System attributes are a mess; a negative attribute number implies not only that the column is by default hidden from view but also that it is stored in the tuple header rather than the tuple data. Using one here, where we're not even talking about a property of the tuple per se, would be a pretty goofy solution. > Besides, I'm not entirely convinced that this is actually > an important distinction to expose. I think it's probably an important distinction, for the kinds of reasons Anssi mentions, but we should look for some method other than a system column of indicating it. Maybe there's a magic function that returns a Boolean which you can call, or maybe some special clause, as with WITH ORDINALITY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/05/2014 07:59 AM, Robert Haas wrote: > I think it's probably an important distinction, for the kinds of > reasons Anssi mentions, but we should look for some method other than > a system column of indicating it. Maybe there's a magic function that > returns a Boolean which you can call, or maybe some special clause, as > with WITH ORDINALITY. I thought the point of INSERT ... ON CONFLICT update was so that you didn't have to care if it was a new row or not? If you do care, it seems like it makes more sense to do your own INSERTs and UPDATEs, as Django currently does. I wouldn't be *opposed* to having a pseudocolumn in the RETURNed stuff which let me know updated|inserted|ignored, but I also don't see it as a feature requirement for 9.5. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Fri, Dec 5, 2014 at 10:00 AM, Josh Berkus <josh@agliodbs.com> wrote:
> I thought the point of INSERT ... ON CONFLICT update was so that you
> didn't have to care if it was a new row or not?
>
> If you do care, it seems like it makes more sense to do your own INSERTs
> and UPDATEs, as Django currently does.
>
> I wouldn't be *opposed* to having a pseudocolumn in the RETURNed stuff
> which let me know updated|inserted|ignored, but I also don't see it as a
> feature requirement for 9.5.

Agreed. Importantly, we won't have painted ourselves into a corner where we cannot add it later, now that RETURNING projects updated tuples, too (v1.5 established that). I'm pretty confident that it would be a mistake to try and make the inner "excluded" and "target" aliases visible, in any case, because of the annoying ambiguity it creates for the common, simple cases. I think it has to be a magic function, or something like that.

I really hoped we'd be having a more technical, implementation-level discussion at this point. Simon rightly emphasized the importance of getting the semantics right, and the importance of discussing that up front, but I'm concerned; that's almost all that has been discussed here so far, which is surely not balanced either.

-- Peter Geoghegan
On Fri, Dec 5, 2014 at 1:07 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Agreed. Importantly, we won't have painted ourselves into a corner
> where we cannot add it later, now that RETURNING projects updated
> tuples, too (v1.5 established that).

Yeah, it seems fine to postpone that to a later version, as long as we haven't painted ourselves into a corner.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Dec 5, 2014 at 1:01 AM, Anssi Kääriäinen <anssi.kaariainen@thl.fi> wrote: > If Django is going to use the INSERT ... ON CONFLICT UPDATE variant in > Django for the existing save() method, then it needs to know if the > result was an UPDATE or INSERT. If we are going to use this for other > operations (for example bulk merge of rows to the database), it would be > very convenient to have per-row updated/created information available so > that we can fire the post_save signals for the rows. If we don't have > that information available, it means we can't fire signals, and no > signals means we can't use the bulk merge operation internally as we > have to fire the signals where that happened before. > > Outside of Django there are likely similar reasons to want to know if > the result of an operation was a creation of a new row. The reason could > be creation of related row, doing some action in application layer, or > just UI message telling "object created successfully" vs "object updated > successfully". It probably isn't ideal, but you'd at least be able to do something with row level triggers in the absence of a standard way of directly telling if an insert or update was performed. -- Peter Geoghegan
On Fri, 2014-12-05 at 10:00 -0800, Josh Berkus wrote: > I thought the point of INSERT ... ON CONFLICT update was so that you > didn't have to care if it was a new row or not? > > If you do care, it seems like it makes more sense to do your own INSERTs > and UPDATEs, as Django currently does. Django tries to update the object if it already exists in the database. If it doesn't, then Django does an insert. This is suboptimal from concurrency standpoint, and does two round trips to the database instead of just one. For Django, both insert and update are OK when saving an object to the database, but Django needs to know which one was done. I too agree that this doesn't need to be handled in the first version of the patch. - Anssi
Attached revision, v1.6, slightly tweaks the ordering of per-statement trigger execution. The ordering is now explicitly documented (the html mirror has been updated: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html). As always, there is a variant for each approach to value locking.

This revision fixes bitrot that developed when the patchset was applied on master's tip, and also cleans up comments regarding how the parent insert carries auxiliary/child state through all stages of query processing. That structure should be clearer now, including how setrefs.c has the auxiliary/child ModifyTable use the same resultRelation as its parent.

-- Peter Geoghegan
On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached revision, v1.6, slightly tweaks the ordering of per-statement
> trigger execution.

Right now, there is no way for a before row insert/update trigger to determine whether it was called as part of an INSERT ... ON CONFLICT UPDATE or not. It's also not possible for a DO INSTEAD trigger on a view (a before row insert trigger) to determine that it was called specifically due to an INSERT ... IGNORE (which I think ought to imply that any corresponding, "redirected" insertion into a table should also use IGNORE... that's at least going to be something that a certain number of apps will need to be made robust against).

The question is: do we want to expose this distinction to triggers? The natural way to do so would probably be to add a TG_SPECULATIVE special variable to plpgsql (and equivalent variables in other PLs). This text variable would be either "UPSERT" or "IGNORE"; it would be NULL when not applicable (e.g. with traditional INSERTs).

How do people feel about this? Is it important to include this in our initial cut of the feature? I thought that I'd avoid that kind of thing until prompted to address it by others, since it probably won't end up being a common concern, but I'd like to hear a few opinions.

-- Peter Geoghegan
On Thu, Dec 11, 2014 at 1:11 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote: >> Attached revision, v1.6, slightly tweaks the ordering of per-statement >> trigger execution. > > Right now, there is no way for a before row insert/update trigger to > determine whether it was called as part of an INSERT ... ON CONFLICT > UPDATE or not. It's also not possible for a DO INSTEAD trigger on a > view (a before row insert trigger) to determine that it was called > specifically due to an INSERT...IGNORE (which I think ought to imply > that any corresponding, "redirected" insertion into a table should > also use IGNORE....that's at least going to be something that a > certain number of apps will need to be made robust against). > > The question is: Do we want to expose this distinction to triggers? > The natural way to do so would probably be to add TG_SPECULATIVE > special variable to plpgsql (and equivalent variables in other PLs). > This text variable would be either "UPSERT" or "IGNORE"; it would be > NULL when it was not applicable (e.g. with traditional INSERTs). > > How do people feel about this? Is it important to include this in our > initial cut of the feature? I thought that I'd avoid that kind of > thing until prompted to address it by others, since it probably won't > end up being a common concern, but I'd like to hear a few opinions. It's probably something we should add, but there's enough to do getting the basic feature working first. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Dec 12, 2014 at 6:39 AM, Robert Haas <robertmhaas@gmail.com> wrote: > It's probably something we should add, but there's enough to do > getting the basic feature working first. Moving this patch to CF 2014-12 as work is still going on. -- Michael
On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached revision, v1.6, slightly tweaks the ordering of per-statement
> trigger execution. The ordering is now explicitly documented (the html
> mirror has been updated:
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html).
> As always, there is a variant for each approach to value locking.
> This revision fixes bitrot that developed when the patchset was
> applied on master's tip, and also cleans up comments regarding how the
> parent insert carries auxiliary/child state through all stages of
> query processing. That should structure be clearer now, including how
> setrefs.c has the auxiliary/child ModifyTable use the same
> resultRelation as its parent.
If I build either option of the patch under MinGW, I get an error in the grammar files related to the IGNORE reserved word.
$ (./configure --host=x86_64-w64-mingw32 --without-zlib && make && make check) > /dev/null
In file included from ../../../src/include/parser/gramparse.h:29:0,
from gram.y:59:
../../../src/include/parser/gram.h:207:6: error: expected identifier before numeric constant
In file included from gram.y:14366:0:
I don't get this problem on Linux.
The build chain seems to meet the specified minimum:
flex.exe 2.5.35
bison (GNU Bison) 2.4.2
This is perl, v5.8.8 built for msys-64int
It seems like IGNORE is getting replaced by the preprocessor with something else, but I don't know how to get my hands on the intermediate file after the preprocessor has done its thing.
Also, in both Linux and MinGW under option 1 patch I get an OID conflict on OID 3261.
Cheers,
Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > It seems like IGNORE is getting replaced by the preprocessor with something > else, but I don't know how to get my hands on the intermediate file after > the preprocessor has done its thing. Maybe IGNORE is defined as a macro in MinGW? Try s/IGNORE/IGNORE_P/g throughout the patch. regards, tom lane
On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote: > Also, in both Linux and MinGW under option 1 patch I get an OID conflict on > OID 3261. I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice about macro collisions on MinGW while I'm at it, since his explanation seems plausible. -- Peter Geoghegan
On Mon, Dec 15, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jeff Janes <jeff.janes@gmail.com> writes: >> It seems like IGNORE is getting replaced by the preprocessor with something >> else, but I don't know how to get my hands on the intermediate file after >> the preprocessor has done its thing. > > Maybe IGNORE is defined as a macro in MinGW? > Try s/IGNORE/IGNORE_P/g throughout the patch. BTW, the gcc -E flag does this. So figure out what exact arguments MinGW's gcc is passed in the ordinary course of compiling gram.c, and prepend "-E" to the list of existing flags while manually executing gcc -- that should let you know exactly what's happening here. -- Peter Geoghegan
On Mon, Dec 15, 2014 at 4:59 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Also, in both Linux and MinGW under option 1 patch I get an OID conflict on
>> OID 3261.
>
> I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice
> about macro collisions on MinGW while I'm at it, since his explanation
> seems plausible.

The attached pair of revised patch sets fixes the OID collision, and presumably fixes the MinGW issue (because IGNORE_P is now used as the token name). It also polishes approach #2 to value locking in a few places (e.g. better comments). Finally, both patches fix a minor buglet in EXPLAIN ANALYZE output -- the output now indicates whether tuples are pulled up from auxiliary update nodes.

-- Peter Geoghegan
On Mon, Dec 15, 2014 at 5:05 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Maybe IGNORE is defined as a macro in MinGW?
>> Try s/IGNORE/IGNORE_P/g throughout the patch.
>
> BTW, the gcc -E flag does this. So figure out what exact arguments
> MinGW's gcc is passed in the ordinary course of compiling gram.c, and
> prepend "-E" to the list of existing flags while manually executing
> gcc -- that should let you know exactly what's happening here.
Yep, I tried that trick and had decided it didn't work in MinGW. But I think it was user error--I must have somehow broken the build tree, and 'make' didn't detect the problem. Now I see that IGNORE is getting turned into 0.
Your new version 1.7 of the patches fixes that issue, as well as the OID conflict.
Thanks,
Jeff
On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > Your new version 1.7 of the patches fixes that issue, as well as the OID > conflict. Good. You're probably aware that I maintain a stress testing suite for the patch here: https://github.com/petergeoghegan/upsert In the past, you've had a lot of success with coming up with stress tests that find bugs. Maybe you can come up with some improvements to the suite, if you'd care to test the patch. I can authorize your Github account to push code to that repo, if you're interested. -- Peter Geoghegan
On 17/12/14 10:11, Peter Geoghegan wrote: > On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes <jeff.janes@gmail.com> wrote: >> Your new version 1.7 of the patches fixes that issue, as well as the OID >> conflict. > Good. > > You're probably aware that I maintain a stress testing suite for the > patch here: https://github.com/petergeoghegan/upsert > > In the past, you've had a lot of success with coming up with stress > tests that find bugs. Maybe you can come up with some improvements to > the suite, if you'd care to test the patch. I can authorize your > Github account to push code to that repo, if you're interested. Yeah! I have just released some prototype software (not related to pg), and I'm going to tell people to treat it with extreme suspicion, no matter how much they may respect the developer (me)! Like Pg, though, it is critical that it records data reliably. Also, both need testing to try to detect intermittent errors (I already found one myself in the prototype - fortunately, not so critical that it needs to be fixed in the prototype, but it would have to be eliminated from the production version!). So I think it really is great to encourage people to come up with demanding tests, especially automated stress testing for pg. Cheers, Gavin (Who wishes he had the time & experience to contribute to pg.)
It looks like we are close to reaching consensus on the syntax. Phew! Thanks for maintaining the wiki pages and the documentation. All of the below is based on those, I haven't looked at the patch itself yet. The one thing that I still feel uneasy about is the Unique Index Inference thing. Per the syntax example from the wiki page, the UPSERT statement looks like this: INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT (key) IGNORE; With ON CONFLICT IGNORE, the list of key columns can also be left out: INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT IGNORE; The documentation says that: > Omitting the specification indicates a total indifference to where > any would-be uniqueness violation could occur, which isn't always > appropriate; at times, it may be desirable for ON CONFLICT IGNORE to > not suppress a duplicate violation within an index where that isn't > explicitly anticipated. Note that ON CONFLICT UPDATE assignment may > result in a uniqueness violation, just as with a conventional > UPDATE. Some questions: 1. Does that mean that if you leave out the key columns, the insertion is IGNOREd if it violates *any* unique key constraint? 2. If you do specify the key columns, then the IGNORE path is taken only if the insertion violates a unique key constraint on those particular columns. Otherwise an error is thrown. Right? Now, let's imagine a table like this: CREATE TABLE persons ( username text unique, real_name text unique, data text ); Is there any way to specify both of those constraints, so that the insertion is IGNOREd if it violates either one of them? If you try to do: INSERT INTO persons(username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username, real_name) IGNORE; It will fail because there is no unique index on (username, real_name). 
In this particular case, you could leave out the specification, but if there was a third constraint that you're not expecting to conflict with, you would want violations of that constraint to still throw an error. And you can't leave out the specification with ON CONFLICT UPDATE anyway. 3. Why is the specification required with ON CONFLICT UPDATE, but not with ON CONFLICT IGNORE? 4. What happens if there are multiple unique indexes with identical columns, and you give those columns in the inference specification? Doesn't matter which index you use, I guess, if they're all identical, but see next question. 5. What if there are multiple unique indexes with the same columns, but different operator classes? 6. Why are partial unique indexes not supported as arbitrators? - Heikki
On 12/17/2014 01:12 PM, Heikki Linnakangas wrote: > 3. Why is the specification required with ON CONFLICT UPDATE, but not > with ON CONFLICT IGNORE? Well, UPDATE has to know which row to lock, no? IGNORE does not. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > It looks like we are close to reaching consensus on the syntax. Phew! Thanks > for maintaining the wiki pages and the documentation. All of the below is > based on those, I haven't looked at the patch itself yet. Great, thanks! Yes, I am relieved that we appeared to have agreed on a syntax. > The one thing that I still feel uneasy about is the Unique Index Inference > thing. > The documentation says that: > >> Omitting the specification indicates a total indifference to where >> any would-be uniqueness violation could occur, which isn't always >> appropriate; > Some questions: > > 1. Does that mean that if you leave out the key columns, the insertion is > IGNOREd if it violates *any* unique key constraint? Yes. This is particularly important for the implementation of things like IGNORE's updatable view support. More generally, for various ETL use cases it's possible to imagine the user simply not caring. > 2. If you do specify the key columns, then the IGNORE path is taken only if > the insertion violates a unique key constraint on those particular columns. > Otherwise an error is thrown. Right? That's right. > Now, let's imagine a table like this: > > CREATE TABLE persons ( > username text unique, > real_name text unique, > data text > ); > > Is there any way to specify both of those constraints, so that the insertion > is IGNOREd if it violates either one of them? If you try to do: > > INSERT INTO persons(username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username, real_name) IGNORE; > > It will fail because there is no unique index on (username, real_name). In > this particular case, you could leave out the specification, but if there > was a third constraint that you're not expecting to conflict with, you would > want violations of that constraint to still throw an error. And you can't > leave out the specification with ON CONFLICT UPDATE anyway. 
Good point. For the IGNORE case: I guess the syntax just isn't that flexible. I agree that that isn't ideal. For the UPDATE case: Suppose your example was an UPDATE where we simply assigned the excluded.data value to the data column in the auxiliary UPDATE's targetlist. What would the user really be asking for with that command, at a really high level? It seems like they might actually want to run two UPSERT commands (one for username, the other for real_name), or rethink their indexing strategy - in particular, whether it's appropriate that there isn't a composite unique constraint on (username, real_name). Now, suppose that by accident or by convention it will always be possible for a composite unique index to be built on (username, real_name) - no dup violations would be raised if it was attempted, but it just hasn't been and won't be. In other words, it's generally safe to actually pretend that there is one. Then, surely it doesn't matter if the user picks one or the other unique index. It'll all work out when the user assigns to both in the UPDATE targetlist, because of the assumed convention that I think is implied by the example. If the convention is violated, at least you get a dup violation letting you know (iff you bothered to assign). But I wouldn't like to encourage that pattern. I think that the long and the short of it is that you really ought to have one unique index as an arbiter in mind when writing a DML statement for the UPDATE variant. Relying on this type of convention is possible, I suppose, but ill-advised. > 3. Why is the specification required with ON CONFLICT UPDATE, but not with > ON CONFLICT IGNORE? That was a fairly recent decision, taken mainly to keep Kevin happy -- although TBH I don't recall that he was particularly insistent on that restriction. I could still go either way on that question. The idea, as I mentioned, is that it's legitimate to not care where a dup violation might occur for certain ETL use cases. 
For UPSERT, though, the only argument for not making it mandatory is that that's something extra to type, and lazy people would prefer not to bother. This is because we assume that the first dup violation is the only possible one without the unique index inference clause. If we don't have the index expressions with which to infer an arbiter unique index (as with MySQL's ON DUPLICATE KEY UPDATE), you'd better be sure that you accounted for all possible sources of would-be duplicate violations - otherwise a random row will be updated! That isn't a fantastic argument for not making a unique index inference clause mandatory, but it might be an okay one. > 4. What happens if there are multiple unique indexes with identical columns, > and you give those columns in the inference specification? Doesn't matter > which index you use, I guess, if they're all identical, but see next > question. It doesn't really matter which one you pick, but right now, at the urging of Robert, we cost the list of candidates and pick the cheapest iff there is more than one [1] (and error if there are none). This is roughly similar to costing of indexes for CLUSTER, and occurs during optimization (only parse analysis of the expressions associated with the unique index inference clause occurs during parse analysis - indexes are looked up and matched in the optimizer). > 5. What if there are multiple unique indexes with the same columns, but > different operator classes? I thought about that. I am reusing a little bit of the CREATE INDEX infrastructure for raw parsing, and for a small amount of parse analysis (conveniently, this makes the command reject things like aggregate functions with no additional code - the error messages only mention "index expressions", so I believe that's fine). This could include an opclass specification, but right now non-default opclasses are rejected during extra steps in parse analysis, for no particular reason. 
I could easily have the unique index inference specification accept a named opclass, if you thought that was important, and you thought naming a non-default opclass by name was a good SQL interface. It would take only a little effort to support non-default opclasses. > 6. Why are partial unique indexes not supported as arbitrators? Robert and I discussed this quite a bit -- it was the argument for being able to name a unique index by name (not that I'm very happy with that idea or anything) [2]. Basically, dealing with the possible behaviors with before row insert triggers might in general greatly complicate the implementation, even though the issues we'd then be protected against would seldom arise. Robert seemed to think that we could revisit this in a future version [3]. Note that IGNORE will still IGNORE any partial unique index -- it just won't accept one as the sole arbiter of whether or not the IGNORE path should be taken (so it's really the inference specification syntax that doesn't accept partial unique indexes, just as it doesn't accept updatable views, exclusion constraints, and inheritance parents where the semantics are similarly iffy -- that's both the reason for and the mechanism by which ON CONFLICT UPDATE does not support these things). [1] http://www.postgresql.org/message-id/CAM3SWZQz+jYkwfuZvcSf0qtpa2QiY+8NGNcHjfWgz3DDzRfzEg@mail.gmail.com [2] http://www.postgresql.org/message-id/CAM3SWZQ8tDPdjiwj_FW4AO8gEvpyiixwBE67OVQuufPJ+y1e1g@mail.gmail.com [3] http://www.postgresql.org/message-id/CA+TgmoZgLgY2PBAMTY3T1jpYXAvNL-w=T6o+6pMqrVR+Vn-iyg@mail.gmail.com -- Peter Geoghegan
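For concreteness, the sort of partial unique index at issue might look like this (a hypothetical schema, not taken from the patch or its regression tests):

```sql
-- Hypothetical example of a partial unique index. Under the patch as
-- described, the inference specification cannot name this index as the sole
-- arbiter, although the IGNORE variant will still suppress duplicate
-- violations arising within it.
CREATE TABLE tasks (
    id      serial PRIMARY KEY,
    state   text NOT NULL,
    payload text NOT NULL
);
CREATE UNIQUE INDEX one_active_per_payload
    ON tasks (payload) WHERE state = 'active';
```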
On 12/18/2014 01:02 AM, Peter Geoghegan wrote: > On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> Now, let's imagine a table like this: >> >> CREATE TABLE persons ( >> username text unique, >> real_name text unique, >> data text >> ); >> >> Is there any way to specify both of those constraints, so that the insertion >> is IGNOREd if it violates either one of them? If you try to do: >> >> INSERT INTO persons(username, real_name, data) >> VALUES('foobar', 'foo bar') >> ON CONFLICT (username, real_name) IGNORE; >> >> It will fail because there is no unique index on (username, real_name). In >> this particular case, you could leave out the specification, but if there >> was a third constraint that you're not expecting to conflict with, you would >> want violations of that constraint to still throw an error. And you can't >> leave out the specification with ON CONFLICT UPDATE anyway. > > Good point. > > For the IGNORE case: I guess the syntax just isn't that flexible. I > agree that that isn't ideal. It should be simple to allow multiple key specifications: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username), (real_name) IGNORE; It's a rather niche use case, but might as well support it for the sake of completeness. > For the UPDATE case: Suppose your example was an UPDATE where we > simply assigned the excluded.data value to the data column in the > auxiliary UPDATE's targetlist. What would the user really be asking > for with that command, at a really high level? It seems like they > might actually want to run two UPSERT commands (one for username, the > other for real_name), or rethink their indexing strategy - in > particular, whether it's appropriate that there isn't a composite > unique constraint on (username, real_name). 
> > Now, suppose that by accident or by convention it will always be > possible for a composite unique index to be built on (username, > real_name) - no dup violations would be raised if it was attempted, > but it just hasn't been and won't be. In other words, it's generally > safe to actually pretend that there is one. Then, surely it doesn't > matter if the user picks one or the other unique index. It'll all work > out when the user assigns to both in the UPDATE targetlist, because of > the assumed convention that I think is implied by the example. If the > convention is violated, at least you get a dup violation letting you > know (iff you bothered to assign). But I wouldn't like to encourage > that pattern. > > I think that the long and the short of it is that you really ought to > have one unique index as an arbiter in mind when writing a DML > statement for the UPDATE variant. Relying on this type of convention > is possible, I suppose, but ill-advised. Another thought is that you might want to specify a different action depending on which constraint is violated: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username) IGNORE ON CONFLICT (real_name) UPDATE ...; Although that leaves the question of what to do if both are violated. Perhaps: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username, real_name) IGNORE ON CONFLICT (real_name) UPDATE username = excluded.username; ON CONFLICT (username) UPDATE real_name = excluded.real_name; >> 5. What if there are multiple unique indexes with the same columns, but >> different operator classes? > > I thought about that. I am reusing a little bit of the CREATE INDEX > infrastructure for raw parsing, and for a small amount of parse > analysis (conveniently, this makes the command reject things like > aggregate functions with no additional code - the error messages only > mention "index expressions", so I believe that's fine). 
This could > include an opclass specification, but right now non-default opclasses > are rejected during extra steps in parse analysis, for no particular > reason. > > I could easily have the unique index inference specification accept a > named opclass, if you thought that was important, and you thought > naming a non-default opclass by name was a good SQL interface. It > would take only a little effort to support non-default opclasses. It's a little weird to mention an opclass by name. It's similar to naming an index by name, really. How about naming the operator? For an exclusion constraint, that would be natural, as the syntax to create an exclusion constraint in the first place is "EXCLUDE USING gist (c WITH &&)" Naming the index by columns makes sense in most cases, and I don't like specifying the index's name, but how about allowing naming a constraint? Indexes are just an implementation detail, but constraints are not. Unique and exclusion constraints are always backed by an index, so there is little difference in practice, but I would feel much more comfortable mentioning constraints by name than indexes. Most people would list the columns, but if there is a really bizarre constraint, with non-default opclasses, or an exclusion constraint, it's probably been given a name that you could use. In theory, with the promise tuple approach to locking, you don't necessarily even need an index to back up the constraint. You could just do a sequential scan of the whole table to see if there are any conflicting rows, then insert the row, and perform another scan to see if any conflicting rows appeared in the meantime. Performance would suck, and there is no guarantee that another backend doesn't do a regular INSERT into to the table that violates the imaginary constraint, so this is pretty useless in practice. So probably better to not allow it. - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username), (real_name) IGNORE; > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username) IGNORE > ON CONFLICT (real_name) UPDATE ...; > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username, real_name) IGNORE > ON CONFLICT (real_name) UPDATE username = excluded.username; > ON CONFLICT (username) UPDATE real_name = excluded.real_name; I like all of these suggestions, except that I think they reflect a couple things about the syntax which was never settled[1]. First, Robert suggested using DUPLICATE instead of CONFLICT, which I think it clearer. So the above would become: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON DUPLICATE (username), (real_name) IGNORE; INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON DUPLICATE (username) IGNORE ON DUPLICATE (real_name) UPDATE ...; INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON DUPLICATE (username, real_name) IGNORE ON DUPLICATE (real_name) UPDATE username = excluded.username; ON DUPLICATE (username) UPDATE real_name = excluded.real_name; Second, he suggested a shorthand way of specifying that all the values from the failed INSERT should be used for the UPDATE: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar', 'baz') ON DUPLICATE (username) UPDATE; I think the first point got lost in the discussion of the second one. I don't think either point was ever really settled beyond Robert and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] http://www.postgresql.org/message-id/CA+TgmoZN=2AJKi1n4Jz5BkmYi8r_CPUDW+DtoppmTeLVmsOoqw@mail.gmail.com
On 12/18/2014 05:46 PM, Kevin Grittner wrote: > I don't think either point was ever really settled beyond Robert > and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT. I also prefer ON CONFLICT, because that makes more sense when you consider exclusion constraints, which I'm still hoping that this would support. If not immediately, at least in the future. - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 12/18/2014 05:46 PM, Kevin Grittner wrote: >> I don't think either point was ever really settled beyond Robert >> and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT. > > I also prefer ON CONFLICT, because that makes more sense when you > consider exclusion constraints, which I'm still hoping that this would > support. If not immediately, at least in the future. If you think this can be made to work without a UNIQUE btree index, that is a persuasive point in favor of ON CONFLICT. I had missed (or forgotten) that we thought this could work without a UNIQUE btree index as the basis of detecting when to resort to an UPDATE. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
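To make the exclusion constraint point concrete, here is a sketch (ordinary PostgreSQL DDL, nothing from the patch) of a constraint where a "conflict" is an overlap rather than a duplicate -- which is why CONFLICT generalizes where DUPLICATE would not:

```sql
-- An exclusion constraint: two rows conflict when they name the same room
-- and their time ranges overlap. btree_gist is needed to combine = on an
-- int column with && on a range column in one GiST constraint.
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE bookings (
    room   int,
    during tsrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```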
On Mon, Dec 15, 2014 at 11:06 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Dec 15, 2014 at 4:59 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> Also, in both Linux and MinGW under option 1 patch I get an OID conflict on
>>> OID 3261.
>>
>> I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice
>> about macro collisions on MinGW while I'm at it, since his explanation
>> seems plausible.
>
> Attached pair of revised patch sets fix the OID collision, and
> presumably fix the MinGW issue (because IGNORE_P is now used as a
> token name). It also polishes approach #2 to value locking in a few
> places (e.g. better comments). Finally, both patches have a minor
> buglet around EXPLAIN ANALYZE output fixed -- the output now indicates
> if tuples are pulled up from auxiliary update nodes.
I naively tried this in vallock1 patch:
create table foo(index int, count int);
create unique index on foo(index);
insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;
insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;
insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;
insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;
After actually reading the documentation more closely, I decided this should be an error because "foo" is not a valid table alias in the "update set" expression. Instead of being a parsing/planning error, this executes and the foo.count on the RHS of the assignment always evaluates as zero (even on subsequent invocations when TARGET.count is 1).
If I switch to a text type, then I get seg faults under the same condition:
create table foo(index int, count text);
create unique index on foo(index);
insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count;
insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count;
<boom>
#0 pg_detoast_datum_packed (datum=0x0) at fmgr.c:2270
#1 0x000000000074fb7a in textcat (fcinfo=0x1e67a78) at varlena.c:662
#2 0x00000000005a63a5 in ExecMakeFunctionResultNoSets (fcache=0x1e67a08, econtext=0x1e67848, isNull=0x1e68b11 "", isDone=<value optimized out>)
at execQual.c:2026
#3 0x00000000005a2353 in ExecTargetList (projInfo=<value optimized out>, isDone=0x7fffa7fa346c) at execQual.c:5358
#4 ExecProject (projInfo=<value optimized out>, isDone=0x7fffa7fa346c) at execQual.c:5573
#5 0x00000000005a86c2 in ExecScan (node=0x1e67738, accessMtd=0x5baf00 <SeqNext>, recheckMtd=0x5bad60 <SeqRecheck>) at execScan.c:207
#6 0x00000000005a1918 in ExecProcNode (node=0x1e67738) at execProcnode.c:406
#7 0x000000000059ef32 in EvalPlanQualNext (epqstate=<value optimized out>) at execMain.c:2380
#8 0x00000000005b8fcd in ExecLockUpdateTuple (node=0x1e5f750) at nodeModifyTable.c:1098
#9 ExecInsert (node=0x1e5f750) at nodeModifyTable.c:372
#10 ExecModifyTable (node=0x1e5f750) at nodeModifyTable.c:1396
#11 0x00000000005a1958 in ExecProcNode (node=0x1e5f750) at execProcnode.c:383
#12 0x00000000005a0642 in ExecutePlan (queryDesc=0x1dd0908, direction=<value optimized out>, count=0) at execMain.c:1515
#13 standard_ExecutorRun (queryDesc=0x1dd0908, direction=<value optimized out>, count=0) at execMain.c:308
#14 0x00007f601416b9cb in pgss_ExecutorRun (queryDesc=0x1dd0908, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:874
#15 0x000000000069385f in ProcessQuery (plan=0x1e47df0,
....
So I think there needs to be some kind of logic to de-recognize the table alias "foo".
Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this through an adaptation of my usual torture test, and it ran fine until wraparound shutdown. I'll poke at it more later.
Cheers,
Jeff
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > After actually reading the documentation more closely, I decided this should > be an error because "foo" is not a valid table alias in the "update set" > expression. Instead of being a parsing/planning error, this executes and > the foo.count on the RHS of the assignment always evaluates as zero (even on > subsequent invocations when TARGET.count is 1). > > If I switch to a text type, then I get seg faults under the same condition: > So I think there needs to be some kind of logic to de-recognize the table > alias "foo". > > Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this > through an adaptation of my usual torture test, and it ran fine until > wraparound shutdown. I'll poke at it more later. Oops. I agree with your diagnosis, and will circle around to fix that bug in the next revision by, as you say, simply rejecting the query if it doesn't use the two standard aliases. Thanks for testing! -- Peter Geoghegan
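For reference, Jeff's rewritten query presumably used the patch's TARGET alias along these lines (a sketch inferred from the documented TARGET/EXCLUDED aliases, not quoted from the thread):

```sql
-- TARGET refers to the existing (conflicting) row; EXCLUDED refers to the
-- row proposed for insertion. Referencing the table name "foo" directly in
-- the SET expression is what the patch will now reject.
INSERT INTO foo (index, count) VALUES (0, 1)
    ON CONFLICT (index)
    UPDATE SET count = TARGET.count + 1
    RETURNING count;
```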
On Thu, Dec 18, 2014 at 7:51 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 12/18/2014 05:46 PM, Kevin Grittner wrote: >> >> I don't think either point was ever really settled beyond Robert >> and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT. > > > I also prefer ON CONFLICT, because that makes more sense when you consider > exclusion constraints, which I'm still hoping that this would support. If > not immediately, at least in the future. This was why I changed the spelling to ON CONFLICT. It also doesn't hurt that that spelling is dissimilar to MySQL's syntax, IMV, because there are plenty of things to dislike about ON DUPLICATE KEY UPDATE, and I think a veneer of compatibility is inappropriate - this syntax is both considerably more flexible and considerably safer. -- Peter Geoghegan
On Thu, Dec 18, 2014 at 6:59 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: >> Good point. >> >> For the IGNORE case: I guess the syntax just isn't that flexible. I >> agree that that isn't ideal. > > > It should be simple to allow multiple key specifications: > > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username), (real_name) IGNORE; > > It's a rather niche use case, but might as well support it for the sake of > completeness. I guess that wouldn't be very hard to implement, and perhaps we should do so soon. I am reluctant to let scope creep too far, though. As you mentioned, this is a niche use case. >> I think that the long and the short of it is that you really ought to >> have one unique index as an arbiter in mind when writing a DML >> statement for the UPDATE variant. Relying on this type of convention >> is possible, I suppose, but ill-advised. > > Another thought is that you might want to specify a different action > depending on which constraint is violated: > > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username) IGNORE > ON CONFLICT (real_name) UPDATE ...; > > Although that leaves the question of what to do if both are violated. > Perhaps: > > INSERT INTO persons (username, real_name, data) > VALUES('foobar', 'foo bar') > ON CONFLICT (username, real_name) IGNORE > ON CONFLICT (real_name) UPDATE username = excluded.username; > ON CONFLICT (username) UPDATE real_name = excluded.real_name; I think that there might be a place for that, but I'd particularly like to avoid figuring this out now - this suggestion is a complicated new direction for the patch, and it's not as if adding this kind of flexibility is precluded by not allowing it in the first version - we won't paint ourselves into a corner by not doing this up front. The patch is already complicated enough! 
Users can always have multiple UPSERT commands, and that might be very close to good enough for a relatively rare use case like this. >> I could easily have the unique index inference specification accept a >> named opclass, if you thought that was important, and you thought >> naming a non-default opclass by name was a good SQL interface. It >> would take only a little effort to support non-default opclasses. > > It's a little weird to mention an opclass by name. It's similar to naming an > index by name, really. How about naming the operator? For an exclusion > constraint, that would be natural, as the syntax to create an exclusion > constraint in the first place is "EXCLUDE USING gist (c WITH &&)" > > Naming the index by columns makes sense in most cases, and I don't like > specifying the index's name, but how about allowing naming a constraint? > Indexes are just an implementation detail, but constraints are not. Unique > and exclusion constraints are always backed by an index, so there is little > difference in practice, but I would feel much more comfortable mentioning > constraints by name than indexes. The main reason for naming a constraint by name in practice will probably be because there is no better way to deal with partial unique indexes (which can be quite useful). But partial unique indexes aren't formally constraints, in that they don't have pg_constraint entries. So I don't think that that's going to be acceptable, entirely for that reason. :-( > Most people would list the columns, but if there is a really bizarre > constraint, with non-default opclasses, or an exclusion constraint, it's > probably been given a name that you could use. What I find curious about the opclass thing is: when do you ever have an opclass that has a different idea of equality than the default opclass for the type? In other words, when is B-Tree strategy number 3 not actually '=' in practice, for *any* B-Tree opclass? 
Certainly, it doesn't appear to be the case that it isn't so with any shipped opclasses - the shipped non-default B-Tree opclasses only serve to provide alternative notions of sort order, and never "equals". I think that with B-Tree (which is particularly relevant for the UPDATE variant), it ought to be defined to work with the type's default opclass "equals" operator, just like GROUP BY and DISTINCT. Non-default opclass unique indexes work just as well in practice, unless someone somewhere happens to create an oddball one that doesn't use '=' as its "equals" operator (while also having '=' as the default opclass "equals" operator). I am not aware that that leaves any actually shipped opclass out (and I include our external extension ecosystem here, although I might be wrong about that part). > In theory, with the promise tuple approach to locking, you don't necessarily > even need an index to back up the constraint. > So probably better to not allow it. I agree that we definitely want to require that there is an appropriate index available. I think we can live without support for partial unique indexes for the time being. With non-default opclasses effectively handled (by caring about the "equals" operator only, and acceptable non-default opclass indexes when that happens to match the default's), and by assuming that having an INSERT ... ON CONFLICT IGNORE without an inference specification to find an exclusion constraint is enough, we have acceptable semantics, IMV. The worst part of that is that partial unique indexes cannot be used with the ON CONFLICT UPDATE variant, in my opinion, but Rome wasn't built in a day. It would be nice to have a way of discriminating against particular indexes (unique constraint-related, partial unique, or otherwise) for the IGNORE variant, but I fear that that'll be difficult to figure out in time. 
There is no need to address those questions in the first version, since I don't think we're failing to play nice with another major feature. We already have something much more flexible than the equivalent features in other major systems.

--
Peter Geoghegan
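The question upthread about strategy number 3 can be checked directly against the catalogs: for any B-Tree opclass, the strategy 3 entry of its operator family is its "equals" operator. A sketch of such a query (runnable on a stock installation; the text opclasses serve here as the example):

```sql
-- Show the "equals" (strategy 3) operator of every B-Tree opclass
-- whose input type is text; in a stock installation every row
-- reports plain "=".
SELECT opc.opcname, op.oprname AS equals_operator
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
JOIN pg_amop amop ON amop.amopfamily = opc.opcfamily
                 AND amop.amoplefttype = opc.opcintype
                 AND amop.amoprighttype = opc.opcintype
JOIN pg_operator op ON op.oid = amop.amopopr
WHERE am.amname = 'btree'
  AND opc.opcintype = 'text'::regtype
  AND amop.amopstrategy = 3;
```

Repeating the query without the opcintype filter surveys every B-Tree opclass at once, which is how one would hunt for a counter-example.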
On Fri, Dec 19, 2014 at 05:32:43PM -0800, Peter Geoghegan wrote:
> > Most people would list the columns, but if there is a really bizarre
> > constraint, with non-default opclasses, or an exclusion constraint, it's
> > probably been given a name that you could use.
>
> What I find curious about the opclass thing is: when do you ever have
> an opclass that has a different idea of equality than the default
> opclass for the type? In other words, when is B-Tree strategy number 3
> not actually '=' in practice, for *any* B-Tree opclass? Certainly, no
> shipped opclass appears to be a counter-example - the shipped
> non-default B-Tree opclasses only serve to provide alternative notions
> of sort order, never of "equals".

Well, in theory you could build a case-insensitive index on a text column. You could argue that the column should have been defined as citext in the first place, but it might not be, for various reasons.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer
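As an aside, the common workaround for this scenario today is a unique expression index over lower(), rather than a custom opclass. A sketch against a hypothetical table (not something from the patch):

```sql
-- Case-insensitive uniqueness on a plain text column, without citext:
CREATE TABLE users (email text NOT NULL);
CREATE UNIQUE INDEX users_email_ci ON users (lower(email));

INSERT INTO users VALUES ('foo@example.com');
INSERT INTO users VALUES ('FOO@example.com');  -- rejected as a duplicate
```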
On Sat, Dec 20, 2014 at 2:16 AM, Martijn van Oosterhout <kleptog@svana.org> wrote:
>> What I find curious about the opclass thing is: when do you ever have
>> an opclass that has a different idea of equality than the default
>> opclass for the type? In other words, when is B-Tree strategy number 3
>> not actually '=' in practice, for *any* B-Tree opclass? Certainly, no
>> shipped opclass appears to be a counter-example - the shipped
>> non-default B-Tree opclasses only serve to provide alternative notions
>> of sort order, never of "equals".
>
> Well, in theory you could build a case-insensitive index on a text
> column. You could argue that the column should have been defined as
> citext in the first place, but it might not for various reasons.

That generally works in other systems by having a case-insensitive collation. I don't know if that implies that non-bitwise-identical items can be equal according to the "equals" operator in those other systems. There aren't too many examples of that happening in general (I can only think of citext and numeric offhand), presumably because it necessitates a normalization process (such as lower-casing in the case of citext) within the hash opclass support function 1, a process best avoided.

citext is an interesting precedent that supports my argument above, because citext demonstrates that we preferred to create a new type rather than a new non-default opclass (with a non-'=' "equals" operator) when the time came to introduce a new concept of "equals" (and not merely a new, alternative sort order). Again, this is surely due to the system's dependency on the default B-Tree opclass for the purposes of GROUP BY and DISTINCT, whose behavior doesn't necessarily involve sort ordering at all.

--
Peter Geoghegan
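To make the citext precedent concrete, a minimal sketch (citext ships in contrib; the table here is hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS citext;

CREATE TABLE emails (addr citext UNIQUE);
INSERT INTO emails VALUES ('Foo@Example.com');

-- citext's "equals" operator compares case-insensitively, so this
-- non-bitwise-identical value is still a duplicate:
INSERT INTO emails VALUES ('foo@example.com');  -- unique violation
```

The type-level approach means GROUP BY, DISTINCT, hashing and unique enforcement all agree on the case-insensitive notion of equality, which a non-default opclass alone could not guarantee.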
On Thu, Dec 18, 2014 at 9:31 AM, Peter Geoghegan <pg@heroku.com> wrote:
>> So I think there needs to be some kind of logic to de-recognize the table
>> alias "foo".
>>
>> Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
>> through an adaptation of my usual torture test, and it ran fine until
>> wraparound shutdown. I'll poke at it more later.
>
> Oops. I agree with your diagnosis, and will circle around to fix that
> bug in the next revision

Attached patch fixes the bug. I'm not delighted about the idea of cutting off the parent parse state (the parse state of the INSERT) within transformUpdateStmt() only once we've used the parent state to establish that this is a "speculative"/auxiliary update, but it's probably the path of least resistance here.

When this is rolled into the next version, there will be a test case.

Thanks
--
Peter Geoghegan
Attachment
On Sun, Dec 21, 2014 at 6:56 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Dec 18, 2014 at 9:31 AM, Peter Geoghegan <pg@heroku.com> wrote:
>>> So I think there needs to be some kind of logic to de-recognize the table
>>> alias "foo".
>>>
>>> Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
>>> through an adaptation of my usual torture test, and it ran fine until
>>> wraparound shutdown. I'll poke at it more later.
>>
>> Oops. I agree with your diagnosis, and will circle around to fix that
>> bug in the next revision
>
> Attached patch fixes the bug. I'm not delighted about the idea of
> cutting off parent parse state (the parse state of the insert) within
> transformUpdateStmt() only once we've used the parent state to
> establish that this is a "speculative"/auxiliary update, but it's
> probably the path of least resistance here.
>
> When this is rolled into the next version, there will be a testcase.

Looking at this thread, the last version of this patch is available here:
http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
And they do not apply correctly, so this patch needs a rebase.

Regards,
--
Michael
On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> Looking at this thread, the last version of this patch is available here:
> http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
> And they do not apply correctly, so this patch needs a rebase.

That isn't so. The latest version is much more recent than that. It's available here:

http://www.postgresql.org/message-id/CAM3SWZQTqsCLZ1YJ1OuWFpO-GmFHwtgwTOg+o_NNzxrPa7Cx4A@mail.gmail.com

Everything is tracked in the commitfest app in detail.

--
Peter Geoghegan
On Mon, Dec 22, 2014 at 11:20 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Looking at this thread, the last version of this patch is available here:
>> http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
>> And they do not apply correctly, so this patch needs a rebase.
>
> That isn't so. The latest version is much more recent than that. It's
> available here:
>
> http://www.postgresql.org/message-id/CAM3SWZQTqsCLZ1YJ1OuWFpO-GmFHwtgwTOg+o_NNzxrPa7Cx4A@mail.gmail.com
>
> Everything is tracked in the commitfest app in detail.

Oops, sorry. I was mistaken because of the name of the latest attachments.

--
Michael
On 12/20/2014 11:14 PM, Peter Geoghegan wrote:
> On Sat, Dec 20, 2014 at 2:16 AM, Martijn van Oosterhout
> <kleptog@svana.org> wrote:
>>> What I find curious about the opclass thing is: when do you ever have
>>> an opclass that has a different idea of equality than the default
>>> opclass for the type? In other words, when is B-Tree strategy number 3
>>> not actually '=' in practice, for *any* B-Tree opclass? Certainly, no
>>> shipped opclass appears to be a counter-example - the shipped
>>> non-default B-Tree opclasses only serve to provide alternative notions
>>> of sort order, never of "equals".
>>
>> Well, in theory you could build a case insensetive index on a text
>> column. You could argue that the column should have been defined as
>> citext in the first place, but it might not for various reasons.
>
> That generally works in other systems by having a case-insensitive
> collation. I don't know if that implies that non bitwise identical
> items can be equal according to the "equals" operator in those other
> systems. There aren't too many examples of that happening in general
> (I can only think of citext and numeric offhand), presumably because
> it necessitates a normalization process (such as lower-casing in the
> case of citext) within the hash opclass support function 1, a process
> best avoided.
>
> citext is an interesting precedent that supports my argument above,
> because citext demonstrates that we preferred to create a new type
> rather than a new non-default opclass (with a non-'=' "equals"
> operator) when the time came to introduce a new concept of "equals"
> (and not merely a new, alternative sort order). Again, this is surely
> due to the system's dependency on the default B-Tree opclass for the
> purposes of GROUP BY and DISTINCT, whose behavior doesn't necessarily
> involve sort ordering at all.

Yeah, I don't expect it to happen very often. It's confusing to have multiple definitions of equality.
There is one built-in example: the "record *= record" operator [1]. It's quite special-purpose; the docs even say that such operators "are not intended to be generally useful for writing queries". But there they are.

I feel that it needs to be possible to specify the constraint unambiguously in all cases. These are very rare use cases, but we should have an escape hatch for the rare cases that need it.

What would it take to also support partial indexes?

[1] See http://www.postgresql.org/docs/devel/static/functions-comparisons.html#ROW-WISE-COMPARISON

- Heikki
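The disagreement between the two notions of equality is easy to demonstrate with numeric, whose values carry a display scale that "=" ignores but the binary-image operator sees:

```sql
SELECT ROW(1.0::numeric) =  ROW(1.00::numeric);  -- true: same value
SELECT ROW(1.0::numeric) *= ROW(1.00::numeric);  -- false: the stored
                                                 -- binary images differ
```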
On Mon, Dec 22, 2014 at 1:24 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> I feel that it needs to be possible to specify the constraint unambiguously
> in all cases. These are very rare use cases, but we should have an escape
> hatch for the rare cases that need it.
>
> What would it take to also support partial indexes?

Aside from considerations about how to pick them without using their name, partial unique indexes aren't special at all. My earlier concern was that we'd need to account for before row insert triggers that change values out from under us. But maybe that concern was overblown, come to think of it.

I am already borrowing a little bit of the raw parser's logic for CREATE INDEX statements for unique index inference (during parse analysis) -- we're matching the cataloged index definition attributes/expressions, so this makes a lot of sense. Maybe I had the wrong idea about partial indexes earlier, which was that we must use the values in the tuple proposed for insertion to check that a partial index was a suitable arbiter of whether or not the UPDATE path should be taken with respect to any given tuple. I should just go further with borrowing things from CREATE INDEX, and give the user an optional way of specifying a WHERE clause that is also matched in a similar way to the expressions themselves. Did the partial unique index your UPSERT implied fail to cover the ultimate tuple inserted, after before row insert triggers fired? That's on you as a user... you'll always get an insert, since there won't be a would-be duplicate violation to make there be an update.

I actually care about partial unique indexes a lot. They're a very useful feature. Back when I was an application developer, I frequently used "is_active" boolean columns to represent "logical app-level deletion", where actually deleting the tuple was not possible (e.g. because it may still be referenced in historic records), while not wanting to have it be subject to uniqueness checks as a logically deleted/!is_active tuple.

This measure to support partial indexes, plus the additional leeway around non-default opclass unique indexes that I can add (that they need only match the "equals" operator of the default opclass to be accepted), brings us 99.9% of the way. That only leaves:

* An inability to specify some subset of unique indexes or exclusion constraints for the IGNORE variant (the UPDATE variant is irrelevant).

* An inability to specify an arbitrating *exclusion constraint* as the sole arbiter of whether or not the IGNORE path should be taken (exclusion constraints are not usable for the UPDATE variant, so that's irrelevant again).

Did I forget something? The use cases around these limitations are very rare, and only apply to the IGNORE variant, which seems much less interesting. I'm quite comfortable dealing with them in a later release of PostgreSQL, to cut scope (or avoid adding scope) for 9.5. Do you think that's okay? How often will the IGNORE variant be used when everything shouldn't be IGNOREd anyway?

Although, to be totally fair, I should probably also include:

* Non-default B-Tree opclasses cannot be specified as arbiters of the alternative path (for both IGNORE and UPDATE variants) iff their "equals" operator happens to not be the "equals" operator of the default opclass (which is theoretical, and likely non-existent as a use case).

If you're dead set on having an escape hatch, maybe we should just get over it and add a way of specifying a unique index by name. As I said, these under-served use cases are either exceedingly rare or entirely theoretical.

--
Peter Geoghegan
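The "logical deletion" pattern described above might look like this (table and column names are hypothetical, not from the patch's regression tests):

```sql
-- Rows are never physically deleted; is_active is flipped instead.
-- Only active rows participate in the uniqueness check.
CREATE TABLE accounts (
    username  text    NOT NULL,
    is_active boolean NOT NULL DEFAULT true
);
CREATE UNIQUE INDEX accounts_active_username_idx
    ON accounts (username) WHERE is_active;
```

An inference clause that could match this index's WHERE clause, in the same way the cataloged index expressions are already matched, is exactly the extension being proposed.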
On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> If you're dead set on having an escape hatch, maybe we should just get
> over it and add a way of specifying a unique index by name. As I said,
> these under-served use cases are either exceedingly rare or entirely
> theoretical.

I'm decidedly unenthusiastic about that. People don't expect CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML. I think the solution in this case would be a gateway to problems larger than the one we're trying to solve.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Dec 23, 2014 at 5:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> If you're dead set on having an escape hatch, maybe we should just get
>> over it and add a way of specifying a unique index by name. As I said,
>> these under-served use cases are either exceedingly rare or entirely
>> theoretical.
>
> I'm decidedly unenthusiastic about that. People don't expect CREATE
> INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML. I
> think the solution in this case would be a gateway to problems larger
> than the one we're trying to solve.

I tend to agree. I think we should just live with the fact that not every conceivable use case will be covered, at least initially. Then, if an appreciable demand for even more flexibility emerges, we can revisit this. We already have a syntax that is significantly more flexible than the equivalent feature in any other system. Let's not lose sight of that.

--
Peter Geoghegan
On Tue, Dec 23, 2014 at 11:30 AM, Peter Geoghegan <pg@heroku.com> wrote:
> I tend to agree. I think we should just live with the fact that not
> every conceivable use case will be covered, at least initially.

To be clear: I still think I should go and make the changes that will make the feature play nice with all shipped non-default B-Tree operator classes, and will make it work with partial unique indexes [1]. That isn't difficult or controversial, AFAICT, and gets us very close to satisfying every conceivable use case.

[1] http://www.postgresql.org/message-id/CAM3SWZQdv7GDLwPRv7=rE-gG1QjLOOL3vCmAriCBcTYk8GwqKw@mail.gmail.com

--
Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Dec 23, 2014 at 5:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> >> If you're dead set on having an escape hatch, maybe we should just get
> >> over it and add a way of specifying a unique index by name. As I said,
> >> these under-served use cases are either exceedingly rare or entirely
> >> theoretical.
> >
> > I'm decidedly unenthusiastic about that. People don't expect CREATE
> > INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML. I
> > think the solution in this case would be a gateway to problems larger
> > than the one we're trying to solve.
>
> I tend to agree. I think we should just live with the fact that not
> every conceivable use case will be covered, at least initially. Then,
> if an appreciable demand for even more flexibility emerges, we can
> revisit this. We already have a syntax that is significantly more
> flexible than the equivalent feature in any other system. Let's not
> lose sight of that.

+1

Thanks,

Stephen
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've put this through an adaptation of my usual torture test, and it ran
> fine until wraparound shutdown. I'll poke at it more later.

Could you elaborate, please? What are the details of the torture test you're performing?

--
Peter Geoghegan
On Fri, Dec 19, 2014 at 5:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Most people would list the columns, but if there is a really bizarre
>> constraint, with non-default opclasses, or an exclusion constraint, it's
>> probably been given a name that you could use.
>
> What I find curious about the opclass thing is: when do you ever have
> an opclass that has a different idea of equality than the default
> opclass for the type? In other words, when is B-Tree strategy number 3
> not actually '=' in practice, for *any* B-Tree opclass? Certainly, no
> shipped opclass appears to be a counter-example - the shipped
> non-default B-Tree opclasses only serve to provide alternative notions
> of sort order, never of "equals".
>
> I think that with B-Tree (which is particularly relevant for the
> UPDATE variant), it ought to be defined to work with the type's
> default opclass "equals" operator, just like GROUP BY and DISTINCT.
> Non-default opclass unique indexes work just as well in practice,
> unless someone somewhere happens to create an oddball one that doesn't
> use '=' as its "equals" operator (while also having '=' as the default
> opclass "equals" operator). I am not aware that that leaves any
> actually shipped opclass out (and I include our external extension
> ecosystem here, although I might be wrong about that part).

So looking at the way the system deals with its dependence on default operator classes, I have a hard time justifying all this extra overhead for the common case. The optimizer will refuse to use an index with a non-default opclass even when AFAICT there is no *real* semantic dependence on anything other than the "equals" operator, which seems to always match across a type's opclasses anyway.
For example, DISTINCT will not use a non-default opclass B-Tree index, even though in practice the "equals" operator always matches for shipped non-default opclasses; DISTINCT will not work with a text_pattern_ops index, while it will work with a default text B-Tree opclass index, *even though no corresponding "ORDER BY" was given*.

Someone recently pointed out in a dedicated thread that the system isn't all that bright about exploiting the fact that group aggregates don't necessarily need to care about facets of sort ordering like collations, which have additional overhead [1]. That might be a useful special case to target (to make underlying sorts faster), but the big picture is that the system doesn't know when it only needs to care about an "equals" operator matching some particular B-Tree-opclass-defined notion of sorting, rather than caring about a variety of operators matching. Sometimes, having a matching "equals" operator of some non-default opclass is good enough to make an index (or sort scheme) of that opclass usable for some purpose that only involves equality, and not sort order (like DISTINCT, with no ORDER BY, executed using a GroupAggregate, for example).

I thought we should formalize the idea that a non-default opclass must have the same notion of equality (the same "equals" operator) as its corresponding default opclass, if any. That way, presumably, the optimizer has license to be clever about only caring about "DISTINCTness"/equality. That also gives my implementation license to not care about which operator class a unique index uses -- it must not matter.

Heikki pointed out that there is one shipped opclass that has an "equals" operator that happens to not be spelt "=" [2] (and furthermore, does not match that of the default opclass). That's the record_image_ops opclass, which unusually has an "equals" operator of "*=".
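A sketch of the asymmetry described above, against a hypothetical table (actual plans depend on statistics, so no particular EXPLAIN output is implied):

```sql
CREATE TABLE t (col text);

-- A text_pattern_ops index changes sort order (useful for LIKE prefix
-- matching under a non-C locale), but its "equals" operator is still "=":
CREATE INDEX t_pattern_idx ON t (col text_pattern_ops);
EXPLAIN SELECT * FROM t WHERE col LIKE 'abc%';

-- DISTINCT, however, is tied to the default opclass's notion of
-- equality, and will not consider t_pattern_idx:
EXPLAIN SELECT DISTINCT col FROM t;
```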
So, as Heikki pointed out, it looks like there is some limited precedent for having to worry about B-Tree opclasses that introduce alternative notions of "equals", rather than merely alternative notions of sort order. So much for formalizing that all of a type's B-Tree opclass "equals" operators must match...

...having thought about it for a while more, though, I think we should *still* ignore the opclass for the purposes of unique index inference. The implementation doesn't care about the fact that you used a non-default opclass. Sure, in theory that could lead to inconsistencies, if there were multiple unique indexes of multiple opclasses that just so happened to have incompatible ideas about equality, but that seems ludicrous... we have only one extremely narrow example of how that could happen. Plus there'd have to be *both* unique indexes defined and available for us to infer as appropriate before the inference logic could accidentally infer the wrong idea of equality. That seems like an extremely implausible scenario.

Even if we allow for the idea that alternative notions of equality are something that will happen in the wild, obviously the user cares about the definition of equality that they actually used for the unique index in question. We can document that unique index inference doesn't care about opclasses (recall that I still only plan on letting users infer a B-Tree unique index), which is thought to almost certainly not matter. I think that ought to be fine.

In the next revision of UPSERT, the implementation formally won't care about the opclass of an index when inferring a unique index to use as an arbiter of whether to take the alternative IGNORE/UPDATE path. That's formally left undefined. As already discussed, I will still proceed with allowing the user to pick a partial unique index when writing a unique index inference specification.
[1] http://www.postgresql.org/message-id/CAFjtmHU3Obf5aSpWY7i18diapvjg-418hYySdqUuYhXZtjChhg@mail.gmail.com [2] http://www.postgresql.org/message-id/54988BF5.9000405@vmware.com -- Peter Geoghegan
On Tue, Dec 23, 2014 at 11:55 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've put this through an adaptation of my usual torture test, and it ran
> fine until wraparound shutdown. I'll poke at it more later.
Could you elaborate, please? What are the details of the torture test
you're performing?
I've uploaded it here.
The gist of it is that I increment a count column of a random row (via pk) in multiple connections simultaneously.
When the server crashes, or it gets to a certain number of increments, the threads report their activity up to the parent, which then waits for automatic recovery and compares the state of the database to the reported state of the children threads.
That is for my original code. For this purpose, I made the count go either up or down randomly, and when a row's count passes through zero it gets deleted. Then when it is chosen for increment/decrement again, it has to be inserted. I've made this happen either through an update-or-insert-or-retry loop (two variants) or by using your new syntax.
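For reference, the update-or-insert-or-retry loop being compared against might look roughly like the well-known example from the PostgreSQL documentation (a sketch only; the table and function names here are hypothetical, not the actual harness code):

```sql
CREATE OR REPLACE FUNCTION bump(p_id int, p_delta int) RETURNS void AS $$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE counts SET n = n + p_delta WHERE id = p_id;
        IF found THEN
            RETURN;
        END IF;
        -- Not there: try to insert. A concurrent insert of the same key
        -- raises unique_violation, in which case we loop and retry.
        BEGIN
            INSERT INTO counts (id, n) VALUES (p_id, p_delta);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing; loop back and try the UPDATE again.
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The subtransaction set up by the BEGIN/EXCEPTION block is part of what the proposed ON CONFLICT syntax avoids.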
There is a patch which adds a simulation for a torn-page-write followed by a crash, and also adds some elogs that I've sometimes found useful for tracking down problems, with new GUCs to control them.
I don't think you made changes to the WAL/recovery routines, so I don't expect crashing recovery to be a big hazard for your patch, but I wanted to run a test where I was generally familiar with the framework, and thought an independently derived test might exercise some new aspects.
The one thing I noticed is that using your syntax starts out slightly slower than the retry loop, but then gets much slower (down by 2 or 3 times) after a while. It might be a vacuuming issue. The constant intentional crashes interfere with good vacuuming behavior, and I need to retest this with the intentional crashes turned off to see if that fixes it. I'm having difficulty accessing my usual testing hardware over the holidays, so I'm not getting as much done as I hoped.
I'll try to look at your own stress tests on github as well.
Cheers,
Jeff
On Sat, Dec 27, 2014 at 11:48 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Could you elaborate, please? What are the details of the torture test
>> you're performing?
>
> The gist of it is that I increment a count column of a random row (via pk)
> in multiple connections simultaneously.

This is great. In general, I strongly believe that we should be doing this kind of thing more formally and more frequently. Thanks!

> That is for my original code. For this purpose, I made the count go either
> up or down randomly, and when a row's count passes through zero it gets
> deleted. Then when it is chosen for increment/decrement again, it has to be
> inserted. I've made this happen either through an update-or-insert-or-retry
> loop (two variants) or by using your new syntax.

Did you continue to limit your investigation to value locking approach #1? I think that #2 is the more likely candidate for commit, and that we should focus on it. However, #1 is more "conceptually pure", and is therefore an interesting basis of comparison with #2 when doing this kind of testing.

> There is a patch which adds a simulation for a torn-page-write followed by a
> crash, and also adds some elogs that I've sometimes found useful for
> tracking down problems, with new GUCs to control them.

Cool.

> I don't think you made changes to the WAL/recovery routines, so I don't
> expect crashing recovery to be a big hazard for your patch, but I wanted to
> run a test where I was generally familiar with the framework, and thought an
> independently derived test might exercise some new aspects.

Value locking approach #2 does touch crash recovery; value locking approach #1 does not. I certainly see the logic in starting with independently derived tests. We all have our blind spots.

> The one thing I noticed is that using your syntax starts out slightly slower
> than the retry loop, but then gets much slower (down by 2 or 3 times) after
> a while. It might be a vacuuming issue.

Interesting.
I'd like to compare both approaches to value locking here.

> I'll try to look at your own stress tests on github as well.

Would you be opposed to merging your custom stress-test suite into my git repo? I'll give you the ability to push to it.

I can help you out if you think you'd benefit from access to my quad-core server (Intel Core i7-4770) for stress-testing. I'll coordinate with you about it privately.

--
Peter Geoghegan
On Fri, Dec 26, 2014 at 4:22 PM, Peter Geoghegan <pg@heroku.com> wrote:
> So looking at the way the system deals with its dependence on default
> operator classes, I have a hard time justifying all this extra
> overhead for the common case.

Attached pair of revised patch sets, V1.8:

* Explicitly leaves undefined what happens when a non-default opclass index *with an alternative notion of not just sort order, but equality* exists. In practice it depends on the available unique indexes. I really found it impossible to justify imposing any restriction here, given the total lack of a scenario in which this even *could* matter, let alone will. This is a minor wart, but I think it's acceptable.

* Allows the "unique index inference specification" clause to have a WHERE clause (this is distinct from the WHERE clause that might also appear in the UPDATE auxiliary query). This can be used to infer partial unique indexes. I really didn't want to give up support for partial indexes with the UPDATE variant (recall that the UPDATE variant *requires* an inference clause), since partial unique indexes are particularly useful. Note that the unique index must actually cover the tuple at insert time, or an error is raised. An example of this that appears in the regression tests is:

insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
ERROR: partial arbiter unique index has predicate that does not cover tuple proposed for insertion
DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index "partial_key_index".
HINT: ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire.

* New documentation reflecting the above. A couple of paragraphs in the INSERT SQL reference page now cover these topics.

* Fixed Jeff Janes's bug by adding sanitizing code [1]. Certain illegal queries are now correctly rejected during parse analysis.

* Fixed another tiny buglet in EXPLAIN ANALYZE output with a RETURNING clause, by making sure the auxiliary query plan from the update also has its plan-level targetlist set.

* Minor clean-up to code comments here and there (in particular, for the ExcludedExpr primnode used to implement the EXCLUDED.* pseudo-alias).

* Better serialization failure error messages.

I recommend looking at my mirror of the modified documentation:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

to get up to speed on how the unique index inference specification clause has been extended to support partial unique indexes. As I mentioned, apart from that, the INSERT SQL reference page now covers the definition of a "conflict" and the opclass semantics issues.

I really hope that this deals with all semantics/syntax related loose ends, allowing discussion of this patch to take a more low-level focus, which is what is really needed. I feel that further improvements may be possible, and that the syntax can be even more flexible, but it's already flexible enough for our first iteration of this feature. Importantly, we have something that is enormously more flexible than any equivalent feature in any other system, which includes the flexibility to extend the syntax in various other directions (e.g. specifying particular exclusion constraints).

[1] http://archives.postgresql.org/message-id/CAM3SWZT=HptrGyihZiyT39sPBhp+CXOTW=MhNFzXiLf-Jh4QVA@mail.gmail.com

--
Peter Geoghegan
Attachment
On Sun, Dec 28, 2014 at 3:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Dec 26, 2014 at 4:22 PM, Peter Geoghegan <pg@heroku.com> wrote:
> So looking at the way the system deals with its dependence on default
> operator classes, I have a hard time justifying all this extra
> overhead for the common case.
Attached pair of revised patch sets, V1.8:
Hi Peter,
Using the vallock2 version of V1.8, using the test I previously described, I get some all-null rows, which my code should never create. Also, the index and table don't agree, in this example I find 3 all-null rows in the table, but only 2 in the index. I've attached an example output of querying via index and via full table scan, and also the pageinspect output of the blocks which have the 3 rows, in case that is helpful.
This was just a straight forward issue of firing queries at the database, the crash-inducing part of my test harness was not active during this test. I also ran it with my crashing patch reversed out, in case I introduced the problem myself, and it still occurs.
Using V1.7 of the vallock2 patch, I saw the same thing with some all-null rows. I also saw some other issues where two rows with the same key value would be present twice in the table (violating the unique constraint) but only one of them would appear in the index. I suspect it is caused by the same issue as the all-null rows, and maybe I just didn't run v1.8 enough times to find that particular manifestation under v1.8.
Cheers,
Jeff
Attachment
Hi Jeff,

On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create. Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index. I've attached an example output of querying via
> index and via full table scan, and also the pageinspect output of the blocks
> which have the 3 rows, in case that is helpful.

Interesting. Thanks a lot for your help!

> This was just a straight forward issue of firing queries at the database,
> the crash-inducing part of my test harness was not active during this test.
> I also ran it with my crashing patch reversed out, in case I introduced the
> problem myself, and it still occurs.
>
> Using V1.7 of the vallock2 patch, I saw the same thing with some all-null
> rows. I also saw some other issues where two rows with the same key value
> would be present twice in the table (violating the unique constraint) but
> only one of them would appear in the index. I suspect it is caused by the
> same issue as the all-null rows, and maybe I just didn't run v1.8 enough
> times to find that particular manifestation under v1.8.

This is almost certainly a latent bug with approach #2 to value locking, one that has probably been there all along. Semantics and syntax have been a recent focus, and so the probability that I introduced a regression of this nature in any recent revision seems low. I am going to investigate the problem, and hope to have a diagnosis soon.

Once again, thanks!

--
Peter Geoghegan
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've attached an example output of querying via index and via full table
> scan, and also the pageinspect output of the blocks which have the 3 rows,
> in case that is helpful.

You might have also included output from pageinspect's bt_page_items() function. Take a look at the documentation patch I just posted if the details are unclear.

Thanks
--
Peter Geoghegan
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create. Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index.

Just to be clear: You haven't found any such issue with approach #1 to value locking, right?

I'm curious about how long it took you to see the issue with #2. Were there any special steps? What were the exact steps involved in turning off the hard crash mechanism you mention? It looks like the condition you describe ought to be highlighted by the script automatically. Is that right? (I don't know any Perl and the script isn't really documented at a high level.)

Thanks
--
Peter Geoghegan
On Mon, Dec 29, 2014 at 9:12 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create. Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index.
Just to be clear: You haven't found any such issue with approach #1 to
value locking, right?
Correct, I haven't seen any problems with approach #1.
I'm curious about how long it took you to see the issue with #2. Were
there any special steps? What were the exact steps involved in turning
off the hard crash mechanism you mention?
Generally the problem will occur early on in the process, and if not then it will not occur at all. I think that is because the table starts out empty, and so a lot of insertions collide with each other. Once the table is more thoroughly populated, most queries take the CONFLICT branch, and therefore two insertion branches are unlikely to collide.
At its simplest, I just use the count_upsert.pl script and your patch and forget all the rest of the stuff from my test platform.
So:
pg_ctl stop -D /tmp/data2; rm /tmp/data2 -r;
../torn_bisect/bin/pg_ctl initdb -D /tmp/data2;
../torn_bisect/bin/pg_ctl start -D /tmp/data2 -o "--fsync=off" -w ;
createdb;
perl count_upsert.pl 8 100000
A run of count_upsert.pl 8 100000 takes about 30 seconds on my machine (8 core), and if it doesn't create a problem then I just destroy the database and start over.
The fsync=off setting is not important; I've seen the problem once without it. I just include it because otherwise the run takes a lot longer.
I've attached another version of the count_upsert.pl script, with some more logging targeted to this particular issue.
The problem shows up like this:
init done at count_upsert.pl line 97.
sum is 1036
count is 9720
seq scan doesn't match index scan 1535 == 1535 and 1 == 6 $VAR1 = [
[
6535,
-21
],
.....
(Thousands more lines follow, as it outputs the entire table twice: once gathered by seq scan, once by bitmap index scan.)
The first three lines are normal; the problem starts with the "seq scan doesn't match" line.
In this case the first problem it ran into was that key 1535 was present once with a count column of 1 (found by seq scan) and once with a count column of 6 (found by index scan). The key was also in the seq scan with a count of 6, but the comparison works by sorting each representation of the table by the key column value and then stopping at the first difference; in this case the count columns 1 == 6 failed the assertion.
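That comparison strategy can be sketched in a few lines of Python (hypothetical names; the real check lives in count_upsert.pl and reads the two scans from PostgreSQL):

```python
# Sketch of the seq-scan-vs-index-scan consistency check described above.
# Each "scan" is modeled as a list of (key, count) rows; the real harness
# gathers one list via a seq scan and the other via a bitmap index scan.

def first_mismatch(seq_rows, idx_rows):
    """Sort both representations by key and report the first difference."""
    seq_sorted = sorted(seq_rows)
    idx_sorted = sorted(idx_rows)
    for i, (s, x) in enumerate(zip(seq_sorted, idx_sorted)):
        if s != x:
            return i, s, x  # position and the two disagreeing rows
    if len(seq_sorted) != len(idx_sorted):
        # One scan returned extra rows beyond the common prefix.
        return min(len(seq_sorted), len(idx_sorted)), None, None
    return None  # the two scans agree

# Mirrors the failure above: key 1535 shows up with count 1 in the
# seq scan but count 6 in the index scan.
seq = [(1535, 1), (1535, 6), (6535, -21)]
idx = [(1535, 6), (6535, -21)]
print(first_mismatch(seq, idx))  # → (0, (1535, 1), (1535, 6))
```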
If you get some all-NULL rows, then you will also get Perl warnings issued when the RETURNING clause starts returning NULL when none are expected to be.
The overall pattern seems to be pretty streaky. It could go 20 iterations with no problem, and then it will fail several times in a row. I've seen this pattern quite a bit with other race conditions as well, I think that they may be sensitive to how memory gets laid out between CPUs, and that might depend on some longer-term characteristic of the state of the machine that survives an initdb.
By the way, I also got a new error message a few times that I think might be a manifestation of the same thing:
ERROR: duplicate key value violates unique constraint "foo_index_idx"
DETAIL: Key (index)=(6106) already exists.
STATEMENT: insert into foo (index, count) values ($2,$1) on conflict (index)
update set count=TARGET.count + EXCLUDED.count returning foo.count
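For what it's worth, the logical effect the harness expects from that statement can be modeled as a simple dictionary accumulation; a sketch of the intended semantics only, nothing server-side:

```python
# Model of what the workload expects from:
#   insert into foo (index, count) values (...)
#   on conflict (index) update set count = TARGET.count + EXCLUDED.count
# i.e. accumulate counts into a mapping keyed by index. The duplicate-key
# error above means the insert arm fired even though a conflicting row
# existed, which this model (and a correct implementation) never does.

def upsert(table, index, count):
    if index in table:          # conflict: fold in the excluded count
        table[index] += count
    else:                       # no conflict: plain insert
        table[index] = count

t = {}
upsert(t, 6106, 5)
upsert(t, 6106, -2)  # conflicts; count becomes 5 + (-2) = 3
print(t)             # → {6106: 3}
```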
Cheers,
Jeff
Attachment
On Mon, Dec 29, 2014 at 11:52 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Correct, I haven't seen any problems with approach #1

That helps with debugging #2, then. That's very helpful.

> Generally the problem will occur early on in the process, and if not then it
> will not occur at all. I think that is because the table starts out empty,
> and so a lot of insertions collide with each other. Once the table is more
> thoroughly populated, most query takes the CONFLICT branch and therefore two
> insertion-branches are unlikely to collide.
>
> At its simplest, I just use the count_upsert.pl script and your patch and
> forget all the rest of the stuff from my test platform.

I can reproduce this on my laptop now. I think that building at -O2 and without assertions helps. I'm starting to work through debugging it.

I threw together a quick script for getting pg_xlogdump into a Postgres table (a nice use of the new pg_lsn type). It's here:

https://github.com/petergeoghegan/jjanes_upsert/blob/master/pg_xlogdump2csv.py

It tells a story. Looking at the last segment before shutdown when the problem occurred, I see:

postgres=# select count(*), tx from my_xlogdump group by tx having count(*) > 4 order by 1;
 count |   tx
-------+---------
     5 | 1917836
     5 | 1902576
     5 | 1909746
     5 | 1901586
     5 | 1916971
     6 | 1870077
    39 | 1918004
   119 | 1918003
  2246 | 0
(9 rows)

postgres=# select max(tx::text::int4) from my_xlogdump ;
   max
---------
 1918004
(1 row)

So the last two transactions (1918003 and 1918004) get into some kind of live-lock situation, it looks like. Or at least something that causes them to produce significantly more WAL records than other xacts due to some kind of persistent problem with conflicts.
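The SQL aggregations here amount to counting records per transaction and per rmgr. As a rough illustration of the same analysis over already-parsed records (an assumed, simplified representation; the linked pg_xlogdump2csv.py does the real parsing and loading):

```python
# Group pg_xlogdump records by transaction and by resource manager,
# mirroring the "count(*) ... group by tx" and "group by rmgr" queries.
# Records are assumed pre-parsed into (tx, rmgr) pairs for this sketch.
from collections import Counter

records = [
    ("1918003", "Heap"), ("1918003", "Btree"), ("1918003", "Btree"),
    ("1918004", "Heap"), ("0", "Heap2"),
]

per_tx = Counter(tx for tx, _ in records)        # count(*) group by tx
per_rmgr = Counter(rmgr for _, rmgr in records)  # count(*) group by rmgr

# Transactions producing an outsized share of records stand out here,
# just as 1918003 and 1918004 did in the query above.
print(per_tx.most_common(1))   # → [('1918003', 3)]
print(per_rmgr["Btree"])       # → 2
```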
Here is where the earlier of the two problematic transactions has its first record:

postgres=# select * from my_xlogdump where tx = '1918003' order by r_lsn asc limit 1;
 rmgr | len_rec | len_tot |   tx    |   r_lsn    |  prev_lsn  |                       descr
------+---------+---------+---------+------------+------------+----------------------------------------------------
 Heap |       3 |     203 | 1918003 | 0/1783BB70 | 0/1783BB48 | INSERT off 33 blkref #0: rel 1663/16471/12502 blk
(1 row)

After and including that record, up to and including shutdown, here is the rmgr breakdown:

postgres=# select count(*), rmgr from my_xlogdump where r_lsn >= '0/1783BB70' group by rmgr order by 1;
 count |    rmgr
-------+-------------
     1 | XLOG        -- 1 CHECKPOINT_SHUTDOWN record
     2 | Transaction -- commit records for the two xacts
    20 | Heap2       -- all are CLEAN remxid records, tx is 0
    76 | Heap        -- All from our two xacts...
    80 | Btree       -- All from XID 1918003 only
(5 rows)

So it looks like a bad interaction with VACUUM. Maybe it's a problem with VACUUM interlocking. That was my first suspicion, FWIW. I'll need to do more investigating, but I can provide a custom format dump of the table, in case anyone wants to look at what I have here in detail. I've uploaded it to:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/files/my_xlogdump.custom.dump.table

--
Peter Geoghegan