Thread: INSERT ... ON CONFLICT {UPDATE | IGNORE}

INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
Attached WIP patch extends the INSERT statement, adding a new ON
CONFLICT {UPDATE | IGNORE} clause. This allows INSERT statements to
perform UPSERT operations (if you want a more formal definition of
UPSERT, I refer you to my pgCon talk's slides [1], or the thread in
which I delineated the differences between SQL MERGE and UPSERT [2]).
The patch builds on previous work in this area, and incorporates
feedback from Kevin and Andres.

Overview
=======

Example usage:

INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT UPDATE
SET val = 'update';

Essentially, the implementation has all stages of query processing
track some auxiliary UPDATE state. So, for example, during parse
analysis, UPDATE transformation occurs in an ad-hoc fashion tightly
driven by the parent INSERT, but using the existing infrastructure
(i.e. transformStmt()/transformUpdateStmt() is called, and is
insulated from having to care about the feature as a special case).
There are some restrictions on what this auxiliary update may do, but
FWIW there are considerably fewer than those that the equivalent MySQL
or SQLite feature imposes on their users. All of the following SQL
queries are valid with the patch applied:

-- Nesting within wCTE:
WITH t AS (
    INSERT INTO z SELECT i, 'insert'
    FROM generate_series(0, 16) i
    ON CONFLICT UPDATE SET v = v || 'update' -- use of
operators/functions in targetlist
    RETURNING * -- only projects inserted tuples, never updated tuples
)
SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;

-- IGNORE variant:
INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT IGNORE;

-- predicate within UPDATE auxiliary statement (row is still locked
when the UPDATE predicate isn't satisfied):
INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT UPDATE
WHERE val != 'delete';

As with SQL MERGE (at least as implemented in other systems),
subqueries may not appear within the UPDATE's targetlist, nor may they
appear within the special WHERE clause. But the "INSERT part" of the
query has no additional limitations, so you may for example put
subqueries within a VALUES() clause, or INSERT...SELECT...ON CONFLICT
UPDATE... just as you'd expect. INSERT has been augmented with a new
clause, but that clause does not unreasonably fail to play nice with
any other aspect of insertion. (Actually, that isn't quite true, since
at least for now table inheritance, updatable views and foreign tables
are unsupported. This can be revisited.)

I think that without initially realizing it, I copied the SQLite
syntax [3]. However, unlike with that SQLite feature, CONFLICT only
refers to a would-be duplicate violation, and not a violation of any
other kind of constraint.

How this new approach works (Executor and Optimizer stuff)
============================================

During the execution of the parent ModifyTable, a special auxiliary
subquery (the UPDATE ModifyTable) is considered as a special case.
This is not a subplan of the ModifyTable node in the conventional
sense, and so does not appear within EXPLAIN output. However, it is
more or less independently planned, and entirely driven by the INSERT
ModifyTable. ExecModifyTable() is never called with this special
auxiliary plan state passed directly. Rather, its parent manages the
process as the need arises. ExecLockUpdateTuple() locks and
potentially updates tuples, using the EvalPlanQual() mechanism (even
at higher isolation levels, with appropriate precautions).

The per-tuple expression context of the auxiliary query/plan is used
with EvalPlanQual() from within ExecLockUpdateTuple() (the new routine
tasked with locking and updating on conflict). There is a new
ExecUpdate() call site within ExecLockUpdateTuple(). Given the
restrictions necessarily imposed on this pseudo-rescanning
(principally the outright rejection of anything that necessitates
PARAM_EXEC parameters during planning), this is safe, as far as I'm
aware. It is convenient to be able to re-use infrastructure in such a
way as to more or less handle the UPDATE independently, driven by the
INSERT, except for execution which is more directly handled by the
INSERT (i.e. there is no ExecModifyTable() call in respect of this new
auxiliary ModifyTable plan). Granted, it is kind of bizarre that the
auxiliary query may have a more complex plan than is necessary for our
purposes, but it doesn't actually appear to be a problem when
"rescanning" (Like a SELECT FOR UPDATE/FOR SHARE's node, we call
EvalPlanQualSetTuple() directly). It is likely worthwhile to teach the
optimizer that we really don't care about how the one and only base
rel within the UPDATE auxiliary subquery (the target table) is
scanned, if only to save a few cycles. I have (temporarily) hacked the
optimizer to prevent index-only scans, which are problematic here, by
adding disable_cost when a query parse tree that uses the feature is
seen. Although what I've done is a temporary kludge, the basic idea of
forcing a particular type of relation scan has a precedent: UPDATE
WHERE CURRENT OF artificially forces a TID scan, because only a TID
scan will work correctly there. I couldn't come up with a convenient
way to artificially inject disable_cost into alternative scan types,
in the less invasive style of isCurrentOf, because there is no
convenient qual to target within cost_qual_eval().

As in previous incarnations, we lock each tuple (although, of course,
only with the UPDATE variant). We may or may not also actually proceed
with the update, depending on whether or not the user-specified
special update predicate (if any) is satisfied. But if we do,
EvalPlanQual() is (once the tuple is locked) only ever evaluated on a
conclusively committed and locked-by-us conflict tuple as part of the
process of updating, even though it's possible for the UPDATE
predicate to be satisfied where conceivably it would not be satisfied
by the tuple version actually visible to the command's MVCC snapshot.
I think this is the correct behavior. We all seem to be in agreement
that we should update at READ COMMITTED if *no* version of the tuple
is visible. It seems utterly arbitrary to me to suggest that on the
one hand it's okay to introduce one particular "MVCC violation", but
not another equivalent one. The first scenario is one in which we
update despite our update's (or rather insert's) "predicate" not being
satisfied (according to our MVCC snapshot). The second scenario is one
in which the same "predicate" is also not satisfied according to our
MVCC snapshot, but in a slightly different way. Why bother introducing
a complicated distinction, if it's a distinction without a difference?
I'd rather have a behavior that is consistent, easy to reason about,
and easy to explain. And so, the predicate is considered once, after
conclusively locking a conflict tuple.

It feels natural and appropriate to me that if the special UPDATE qual
isn't satisfied, we still lock the tuple. After all, in order to make
a conclusive determination about the qual not being satisfied, we need
to lock the tuple. This happens to insulate ExecUpdate() from having
to care about "invisible tuples", which are now possible (although we
still throw an error, just with a useful error message that phrases
the problem in reference to this new feature).

Of course, at higher isolation levels serialization errors are thrown
when something inconsistent with the higher level's guarantees would
otherwise need to occur (even for the IGNORE variant). Still,
interactions with SSI, and preserving the guarantees of SSI should
probably be closely considered by a subject matter expert.

Omission
=======

The patch currently lacks a way of referencing datums rejected for
insertion when updating. The way MySQL handles the issue seems
questionable. They allow you to do something like this:

INSERT INTO upsert (key, val) VALUES (1 'val') ON DUPLICATE KEY UPDATE
val = VALUES(val);

The implication is that the updated value comes from the INSERT's
VALUES() list, but emulating that seems like a bad idea. In general,
at least with Postgres it's entirely possible that values rejected
differ from the values appearing in the VALUES() list, due to the
effects of before triggers. I'm not sure whether or not we should
assume equivalent transformations during any UPDATE before triggers.

This is an open item. I think it makes sense to deal with it a bit later.

"Value locking"
===========

To date, on-list discussion around UPSERT has almost exclusively
concerned what I've called "value locking"; the idea of locking values
in unique indexes in the abstract (to establish the right to insert
ahead of time). There was some useful discussion on this question
between myself and Heikki back around December/January. Ultimately, we
were unable to reach agreement on an approach and discussion tapered
off. However, Heikki did understand the concerns that informed by
design. He recognized the need to be able to easily *release* value
locks, so as to avoid "unprincipled deadlocks", where under high
concurrency there are deadlocks between sessions that only UPSERT a
single row at a time. I'm not sure how widely appreciated this point
is, but I believe that Heikki appreciates it. It is a very important
point in my opinion. I don't want an implementation that is in any way
inferior to the "UPSERT looping subxact" pattern does (i.e. the plpsql
thing that the docs suggest).

When we left off, Heikki continued to favor an approach that involved
speculatively inserting heap tuples, and then deleting them in the
event of a conflict. This design was made more complicated when the
need to *release* value locks became apparent (Heikki ended up making
some changes to HeapTupleSatisfiesDirty(), as well as sketching a
design for what you might call a "super delete", where xmin can be set
to InvalidTransactionId for speculatively-inserted heap tuples). After
all, it wasn't as if we could abort a subxact to release locks, which
is what the "UPSERT looping subxact" pattern does. I think it's fair
to say that that design became more complicated than initially
anticipated [4] [5].

Anyway, the greater point here is that fundamentally, AFAICT Heikki
and I were in agreement. Once you buy into the idea that we must avoid
holding on to "value locks" of whatever form - as Heikki evidently did
- then exactly what form they take is ultimately only a detail.
Granted, it's a very important detail, but a detail nonetheless. It
can be discussed entirely independently of all of this new stuff, and
thank goodness for that.

If anyone finds my (virtually unchanged) page heavyweight lock based
value locking approach objectionable, I ask that the criticism be
framed in a way that makes a sharp distinction between each of the
following:

1. You don't accept that value locks must be easily released in the
event of a conflict. Is anyone in this camp? It's far from obvious to
me what side of this question Andres is on at this stage, for example.
Robert might have something to say here too.

2. Having taken into account the experience of myself and Heikki, and
all that is implied by taking that approach ***while avoiding
unprincipled deadlocks***, you continue to believe that an approach
based on speculative heap insertion, or some alternative scheme is
better than what I have done to the nbtree code here, or you otherwise
dislike something about the proposed value locking scheme. You accept
that value locks must be released and released easily in the event of
a conflict, but like Heikki you just don't like what I've done to get
there.

Since we can (I believe) talk about the value locking aspect and the
rest of the patch independently, we should do so...unless you're in
camp 1, in which case I guess that we'll have to thrash it out.

Syntax, footguns
=============

As I mentioned, I have incorporated feedback from Kevin Grittner. You
may specify a unique index to merge on from within the INSERT
statement, thus avoiding the risk of inadvertently having the update
affect the wrong tuple due to the user failing to consider that there
was a would-be unique violation within some other unique index
constraining some other attribute. You may write the DML statement
like this:

INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
upsert_pkey UPDATE SET val = 'update';

I think that there is a good chance that at least some people will
want to make this mandatory. I guess that's fair enough, but I
*really* don't want to *mandate* that users specify the name of their
unique index in DML for obvious reasons. Perhaps we can come up with a
more tasteful syntax that covers all interesting cases (consider the
issues with partial unique indexes and before triggers for example,
where a conclusion reached about which index to use during parse
analysis may subsequently be invalidated by user-defined code, or
ambiguous specifications in the face of overlapping attributes between
two unique composite indexes, etc). The Right Thing is far from
obvious, and there is very little to garner from other systems, since
SQL MERGE promises essentially nothing about concurrency, both as
specified by the standard and in practice. You don't need a unique
index at all, and as I showed in my pgCon talk, there are race
conditions even for a trivial UPSERT operations in all major SQL MERGE
implementations.

Note that making mandatory (via syntax) merging on one particular
unique index buys the implementation no useful leeway. Just for
example, the unprincipled deadlocks test case that illustrated the
problem with early "promise tuple" style approaches to value locking
[6] involved only a single unique index. AFAICT, the question of
whether or not this should be mandatory is just a detail of the
feature's high level design, as opposed to something expected to
significantly influence the implementation.

Testing, performance
===============

As you'd expect, I've included both isolation tests and regression
tests covering a reasonable variety of cases. In addition, stress
testing is an important part of my testing strategy. Reviewers are
encouraged to try out these test bash scripts:

https://github.com/petergeoghegan/upsert

(Interested hackers should request collaborator status on that Github
project from me privately. I welcome new, interesting test cases.)

The performance of the patch seems quite good, and is something that
these stress-testing bash scripts also test. Upserts are compared
against "equivalent" inserts when we know we'll never update, and
against "equivalent" updates when we know we'll never insert.

On an 8 core test server, I can sustain ~90,000 ordinary insert
transactions per second on an unlogged table defined as follows:

create unlogged table foo
(
  merge serial primary key,
  b int4,
  c text
);

In all cases pgbench uses 8 clients (1 per CPU core).

With "equivalent" upserts, it's about ~66,000 TPS. But this is a
particularly unsympathetic case, because I've deliberately exaggerated
the effects of heavyweight lock contention on leaf pages by using a
serial primary key. Plus, there's the additional planning and parsing
overhead.

When comparing updating with updating upserting, it's a similar story.
100,000 tuples are pre-inserted in each case. I can sustain ~98,000
TPS with plain updates, or ~70,000 TPS with "equivalent" upserts.
B-Tree index page heavyweight lock contention probably explains some
of the difference between "UPSERT inserts" and "UPSERT updates".

Interlocking with VACUUM, race conditions
===============================

In previous revisions, when we went to lock + update a tuple, no
"value locks" were held, and neither were any B-Tree page buffer pins,
because they were both released at the same time (recall that I call
my heavyweight lock on B-Tree leaf pages a value lock). We still do
that (unprincipled deadlocks are our only alternative), but now hold
on to the pin for longer, until after tuple locking. Old versions of
this patch used to sit on the B-Tree buffer pin to prevent concurrent
deletion only as long as value locks were held, but maybe it isn't
good enough to sit on the pin until before we lock/update, as value
locks are released: dropping the pin implies that the heap tuple can
physically go away, and in general the same TID may then contain
anything. We may have to interlock against vacuum by sitting on the
B-Tree buffer pin (but not the value lock) throughout locking +
update. That makes it impossible for the heap tuple slot to fail to
relate to the tuple from the B-Tree, that is under consideration for
locking/updating. Recall that we aren't quite dealing with MVCC
semantics here, since in READ COMMITTED mode we can lock a
conclusively committed + visible tuple with *no* version visible to
our command's MVCC snapshot. Therefore, it seems worth considering the
possibility that the nbtree README's observations on the necessity of
holding a pin to interlock against VACUUM (for non-MVCC snapshots)
apply.

In this revision we have two callbacks (or two calls to the same
callback, with different effects): One to release value locks early,
to avoid unprincipled deadlocks, and a second to finally release the
last unneeded buffer pin.

Recall that when we find a conflict (within _bt_checkunique()), it
must be conclusively committed and visible to new MVCC snapshots; we
know at that juncture that it's live. The concern is that it might be
deleted *and* garbage collected in the interim between finding the
conflict tuple, and locking it (in practice this interim period is
only an instant).

This is probably too paranoid, though: the fact that the upserter's
transaction is running ought to imply that GetOldestXmin() returns an
XID sufficient to prevent this. OTOH, I'm not sure that there exists
anything that looks like a precedent for relying on blocking vacuum in
this manner, and it might turn out to be limiting to rely on this.
And, I hasten to add, my fix (sitting on a B-Tree pin throughout row
locking) is in another way perhaps not paranoid enough: Who is to say
that our conflicting value is on the same B-Tree leaf page as our
value lock? If might not be, since _bt_checkunique() looks at later
B-Tree pages (the value locked page is merely "the first leaf page the
value could be on"). Pinning the heavyweight lock page's buffer is
certainly justified by the need for non-speculative inserters to see a
flag that obligates them to acquire the heavyweight page lock
themselves (see comments in patch for more), but this other reason is
kind of dubious.

In other words: I'm relying on the way VACUUM actually works to
prevent premature garbage collection. It's possible to imagine a world
in which HeapTupleSatisfiesVacuum() is smart enough to realize that
the tuple UPSERT wants to lock is not visible to anyone (assuming MVCC
semantics, etc), and never can be. I've tentatively added code to keep
a buffer pin for longer, but that's probably not good enough if we
assume that it's necessary at all. Basically, I want to be comfortable
about my rationale for it being okay that a "non-MVCC" "index scan"
doesn't hold a pin, but right now I'm not. I was conflicted on whether
or not I should include the "unpin later" logic at all; for now I've
left it in, if only as a placeholder. Needless to say, if there is a
race condition you can take it that it's very difficult to isolate.

FWIW, somewhat extensive stress-testing has revealed no bugs that you
might associate with these problems, with and without extended buffer
pinning, and with artificial random sleeps added at key points in an
effort to make any race condition bugs manifest themselves. I have
made a concerted effort to break the patch in that way, and I'm now
running out of ideas. Running the stress tests (with random delays in
key points in the code) for several days reveals no bugs. This is on
the same dedicated 8 core server, with plenty of concurrency.

It's probably a good idea to begin using my B-Tree verification tool
[7] for testing...on the other hand, it doesn't know anything about
MVCC, and will only detect the violation of invariants that are
localized to the B-Tree code, at least at the moment.

Open items
=========

I already mentioned the inability to reference rejected rows in an
UPDATE, as well as my unease about VACUUM interlocking, both of which
are open item. Also, some of the restrictions that I already mentioned
- on updatable views, inheritance, and foreign tables - are probably
unnecessary. We should be able to come with reasonable behavior for at
least some of those.

Patch
====

I thought that I went too long without posting something about all of
this to the list to get feedback, and so I decided to post this WIP
patch set. I've tried to break it up into pieces, but it isn't all
that suitable for representing as cumulative commits. I've also tried
to break up the discussion usefully (the question of how everything
fits together at a high level can hopefully be discussed separately
from the question of how "value locks" are actually implemented).

Thoughts?

[1] http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf,
("Goals for UPSERT in Postgres")
[2] http://www.postgresql.org/message-id/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
[3] https://sqlite.org/lang_conflict.html
[4] http://www.postgresql.org/message-id/CAM3SWZQoArVQGMi=v-jk3sBjsPg+wdjeUkM_6L5TZG_i9pyGzQ@mail.gmail.com
[5] http://www.postgresql.org/message-id/52B4AAF0.5090806@vmware.com
[6] http://www.postgresql.org/message-id/CAM3SWZShbE29KpoD44cVc3vpZJGmDer6k_6FGHiSzeOZGmTFSQ@mail.gmail.com
[7] http://www.postgresql.org/message-id/CAM3SWZRtV+xmRWLWq6c-x7czvwavFdwFi4St1zz4dDgFH4yN4g@mail.gmail.com
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andreas Karlsson
Date:
On 08/28/2014 04:43 AM, Peter Geoghegan wrote:
> -- Nesting within wCTE:
> WITH t AS (
>      INSERT INTO z SELECT i, 'insert'
>      FROM generate_series(0, 16) i
>      ON CONFLICT UPDATE SET v = v || 'update' -- use of
> operators/functions in targetlist
>      RETURNING * -- only projects inserted tuples, never updated tuples
> )
> SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;

Personally I would find it surprising if RETURNING did not also return 
the updated tuples. In many use cases for upsert the user does not care 
if the row was new or not.

What I think would be useful is if all tuples were returned but there 
was some way to filter out only the inserted ones.

-- 
Andreas Karlsson



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Aug 28, 2014 at 7:29 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> Personally I would find it surprising if RETURNING did not also return the
> updated tuples. In many use cases for upsert the user does not care if the
> row was new or not.

I'm not attached to that particular behavior, but it does seem kind of
similar to the behavior of BEFORE triggers, where a NULL return value
("do nothing") will also cause RETURNING to not project the tuple.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andreas Karlsson
Date:
On 08/28/2014 09:05 PM, Peter Geoghegan wrote:
> On Thu, Aug 28, 2014 at 7:29 AM, Andreas Karlsson <andreas@proxel.se> wrote:
>> Personally I would find it surprising if RETURNING did not also return the
>> updated tuples. In many use cases for upsert the user does not care if the
>> row was new or not.
>
> I'm not attached to that particular behavior, but it does seem kind of
> similar to the behavior of BEFORE triggers, where a NULL return value
> ("do nothing") will also cause RETURNING to not project the tuple.

I see. So we have three cases where we may or may not want to project a 
tuple.

1) The tuple was inserted
2) We got a conflict and updated the tuple
3) We got a conflict but skipped updating the tuple

My personal intuition was that (1) and (2) would be returned but not 
(3). But I am not sure if that is the most useful behavior.

-- 
Andreas Karlsson



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Aug 27, 2014 at 7:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> There are some restrictions on what this auxiliary update may do, but
> FWIW there are considerably fewer than those that the equivalent MySQL
> or SQLite feature imposes on their users.


I realized that I missed a few cases here. For one thing, the posted
patch fails to arrange for the UPDATE post-parse-analysis tree
representation to go through the rewriter stage (on the theory that
user-defined rules shouldn't be able to separately affect the
auxiliary UPDATE query tree), but rewriting is at least necessary so
that rewriteTargetListIU() can expand a "SET val = DEFAULT"
targetlist, as well as normalize the ordering of the UPDATE's tlist.
Separately, the patch fails to defend against certain queries that
ought to be disallowed, where a subselect is specified with a subquery
expression in the auxiliary UPDATE's WHERE clause.

These are garden-variety bugs that aren't likely to affect the kind of
high-level design discussion that I'm looking for here. I'll post a
fixed version in a few days time.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Aug 28, 2014 at 8:05 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I realized that I missed a few cases here. For one thing, the posted
> patch fails to arrange for the UPDATE post-parse-analysis tree
> representation to go through the rewriter stage (on the theory that
> user-defined rules shouldn't be able to separately affect the
> auxiliary UPDATE query tree), but rewriting is at least necessary so
> that rewriteTargetListIU() can expand a "SET val = DEFAULT"
> targetlist, as well as normalize the ordering of the UPDATE's tlist.
> Separately, the patch fails to defend against certain queries that
> ought to be disallowed, where a subselect is specified with a subquery
> expression in the auxiliary UPDATE's WHERE clause.

Attached revision fixes all of these issues. I've added regression
tests for each bug, too, although all changes are rebased into my
original commits.

I decided to explicitly rely on a simpler approach to VACUUM
interlocking. I no longer bother holding on to a buffer pin for a
period longer than the period that associated "value locks" are held,
which was something I talked about at the start of this thread. There
is a note on this added to the nbtree README, just after the master
branch's current remarks on B-Tree VACUUM interlocking.

I've also pushed the responsibility for supporting this new feature on
foreign tables onto FDWs themselves. The only writable FDW we
currently ship, postgres_fdw, lacks support for the new feature, but
this can be revisited in due course. My impression is that the task of
adding support is not quite a straightforward matter of adding a bit
more deparsing logic, but also isn't significantly more difficult than
that.

--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Wed, Aug 27, 2014 at 10:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Example usage:
>
> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT UPDATE
> SET val = 'update';

I think that syntax is a dramatic improvement over your previous
proposals.  The only part I don't entire like is this:

> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
> upsert_pkey UPDATE SET val = 'update';

It seems to me that it would be better to specify a conflicting column
set rather than a conflicting index name.

I don't have much in the way of comments about the implementation, at
least not right at the moment, but...

> Essentially, the implementation has all stages of query processing
> During the execution of the parent ModifyTable, a special auxiliary
> subquery (the UPDATE ModifyTable) is considered as a special case.
> This is not a subplan of the ModifyTable node in the conventional
> sense, and so does not appear within EXPLAIN output.

...that sounds wonky.

> I already mentioned the inability to reference rejected rows in an
> UPDATE, as well as my unease about VACUUM interlocking, both of which
> are open item. Also, some of the restrictions that I already mentioned
> - on updatable views, inheritance, and foreign tables - are probably
> unnecessary. We should be able to come with reasonable behavior for at
> least some of those.

If you've noted my comments on the UPDATE/DELETE .. ORDER BY .. LIMIT
thread, you won't be surprised to hear that I think those restrictions
will need to be lifted - especially for inheritance, but probably the
others as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Essentially, the implementation has all stages of query processing
>> During the execution of the parent ModifyTable, a special auxiliary
>> subquery (the UPDATE ModifyTable) is considered as a special case.
>> This is not a subplan of the ModifyTable node in the conventional
>> sense, and so does not appear within EXPLAIN output.
>
> ...that sounds wonky.

Which part? It certainly wouldn't be helpful if the (say) auxiliary
plan's "sequential scan" appeared within EXPLAIN output. That's just
an implementation detail. Note that the structure of the plan is
highly restricted, since it needs to be "driven by the insert" (or,
rather, the insert's conflicts, including conflicts not visible to the
command's MVCC snapshot). There won't be any interesting variation in
the plan. Although, that said, the implementation should probably
display any "Filter: ..." conditions implied by the special UPDATE
qual.

> If you've noted my comments on the UPDATE/DELETE .. ORDER BY .. LIMIT
> thread, you won't be surprised to hear that I think those restrictions
> will need to be lifted - especially for inheritance, but probably the
> others as well.

Sure.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
>> upsert_pkey UPDATE SET val = 'update';
>
> It seems to me that it would be better to specify a conflicting column
> set rather than a conflicting index name.

I'm open to pursuing that, provided there is a possible implementation
that's robust against things like BEFORE triggers that modify
constrained attributes. It must also work well with partial unique
indexes. So I imagine we'd have to determine a way of looking up the
unique index only after BEFORE triggers fire. Unless you're
comfortable with punting on some of these cases by throwing an error,
then all of this is actually surprisingly ticklish. You've already
expressed concerns about the feature not playing nice with existing,
peripheral features though.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Wed, Sep 3, 2014 at 2:13 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Essentially, the implementation has all stages of query processing
>>> During the execution of the parent ModifyTable, a special auxiliary
>>> subquery (the UPDATE ModifyTable) is considered as a special case.
>>> This is not a subplan of the ModifyTable node in the conventional
>>> sense, and so does not appear within EXPLAIN output.
>>
>> ...that sounds wonky.
>
> Which part? It certainly wouldn't be helpful if the (say) auxiliary
> plan's "sequential scan" appeared within EXPLAIN output. That's just
> an implementation detail. Note that the structure of the plan is
> highly restricted, since it needs to be "driven by the insert" (or,
> rather, the insert's conflicts, including conflicts not visible to the
> command's MVCC snapshot). There won't be any interesting variation in
> the plan. Although, that said, the implementation should probably
> display any "Filter: ..." conditions implied by the special UPDATE
> qual.

I think there shouldn't be any plan nodes in the system that don't get
displayed by explain.  If you're using a plan node for something, and
think it shouldn't be displayed by explain, then either (1) you are
wrong or (2) you are abusing the plan node.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 4, 2014 at 8:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think there shouldn't be any plan nodes in the system that don't get
> displayed by explain.  If you're using a plan node for something, and
> think it shouldn't be displayed by explain, then either (1) you are
> wrong or (2) you are abusing the plan node.

Maybe. I admit that I'm not entirely confident that the representation
of the auxiliary state during planning and execution is ideal.
However, it sure is convenient to be able to separately plan the
auxiliary query as a subquery, and not have to specially fish it out
of the subplan list later. Maybe we should add a mechanism that
essentially generates an equivalent, single ModifyTable plan. Or maybe
that would be adding a lot of code for no tangible benefit. I don't
see much point in making one ModifyTable node pull up from the other
for the benefit of this feature (which is another thing entirely to
having there be a single ModifyTable plan). For now, I'm glad to have
something that will allow us to drive discussion of the feature to the
next level. I don't have a good enough understanding of the optimizer
to be able to say with confidence what we should do, or to be able to
see the big picture of making any particular trade-off. It's not an
immediate concern, though.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 4, 2014 at 11:55 AM, Peter Geoghegan <pg@heroku.com> wrote:
> It's not an immediate concern, though.

My immediate concern is to get some level of buy-in about how
everything fits together at a high level. Separately, as discussed in
my opening mail, there is the question of how value locking should
ultimately be implemented. These are two orthogonal questions, or are
pretty close to orthogonal. That helps. It also helps that people have
stopped being confused by the term "value locking" (I think).

I'm tempted to believe that the silence on the question of how things
fit together (such as the lack of discussion of my pgCon talk's
characterization of a "pick any 2" trade-off) means that that's
because everyone agrees with that. That seems pretty naive, though,
because a lot of the issues are very subtle. I think that various
interested people, including Robert and Andres have yet to make their
minds up on that. I'm not sure what Tom thinks of it.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Aug 27, 2014 at 7:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Omission
> =======
>
> The patch currently lacks a way of referencing datums rejected for
> insertion when updating.

Attached revision of the patch set (which I'll call v1.2) adds this
capability in a separate commit. It now becomes possible to add a
CONFLICTING expression within the ON CONFLICT UPDATE targetlist or
predicate. Example use:

"""
postgres=# CREATE TABLE upsert(key int4 PRIMARY KEY, val text);
CREATE TABLE
postgres=# INSERT INTO upsert VALUES(1, 'Giraffe');
INSERT 0 1
postgres=# SELECT * FROM upsert;
 key |   val
-----+---------
   1 | Giraffe
(1 row)

postgres=# INSERT INTO upsert VALUES(1, 'Bear'), (2, 'Lion') ON
CONFLICT UPDATE SET val = CONFLICTING(val);
INSERT 0 1
postgres=# SELECT * FROM upsert;
 key | val
-----+------
   1 | Bear
   2 | Lion
(2 rows)

"""

Note that the effects of BEFORE INSERT triggers are carried here,
which I slightly favor over the alternative of not having it work that
way.

I've also expanded upon my explanation for the structure of the query
tree and plan within (revised/rebased versions of) earlier commits. I
am clearer on why there is a special subquery planning step for the
auxiliary UPDATE, rather than making the UPDATE directly accessible as
a subquery within the post-parse-analysis query tree. Basically, the
optimizer has no basis for understanding that a DML sublink isn't
optimizable. It'll try to pull-up the subquery and so on, which of
course does not and cannot work. Whereas treating it as an
independently planned subquery of the top-level query, kind of like a
data-modifying CTE makes sense (with such CTEs, the executor is
prepared for the possibility that not all rows will be pulled up - so
there too, the executor drives execution more directly than makes
sense when not dealing with DML: it finishes off the data-modifying
CTE's DML for any still-unconsumed tuples, within
ExecPostprocessPlan()).

It's certainly possible that a more unified representation makes sense
(i.e. one ModifyTable plan, likely still having seperate INSERT/UPDATE
representations at earlier stages of query processing), but that would
require serious refactoring of the representation of ModifyTable
operations -- just for example, consider the need for a
unified-though-separate targetlist, one for the INSERT part, the other
for the UPDATE part. For now, I continue to find it very convenient to
represent the UPDATE as a selectively executed, auxiliary, distinct
ModifyTable plan, rather than adding a subquery rangetable directly
during parse analysis.

There is another significant change. In this revision, I am at least
"honest" about the plan representation within EXPLAIN:

"""
postgres=# EXPLAIN ANALYZE INSERT INTO upsert VALUES(1, 'Bear'), (2,
'Lion') ON CONFLICT UPDATE SET val = CONFLICTING(val);
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.03 rows=2 width=36) (actual
time=0.115..0.115 rows=0 loops=1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=36)
(actual time=0.003..0.005 rows=2 loops=1)
   ->  Conflict Update on upsert  (cost=0.00..22.30 rows=1230
width=36) (actual time=0.042..0.051 rows=0 loops=1)
         ->  Seq Scan on upsert  (cost=0.00..22.30 rows=1230 width=36)
(never executed)
 Planning time: 0.065 ms
 Execution time: 0.158 ms
(6 rows)

postgres=# EXPLAIN ANALYZE INSERT INTO upsert VALUES(1, 'Bear'), (2,
'Lion') ON CONFLICT UPDATE SET val = CONFLICTING(val) where key = 2;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.03 rows=2 width=36) (actual
time=0.075..0.075 rows=0 loops=1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=36)
(actual time=0.001..0.002 rows=2 loops=1)
   ->  Conflict Update on upsert  (cost=4.16..8.17 rows=1 width=36)
(actual time=0.012..0.026 rows=0 loops=1)
         ->  Bitmap Heap Scan on upsert  (cost=4.16..8.17 rows=1
width=36) (never executed)
               Recheck Cond: (key = 2)
               ->  Bitmap Index Scan on upsert_pkey  (cost=0.00..4.16
rows=1 width=0) (never executed)
                     Index Cond: (key = 2)
 Planning time: 0.090 ms
 Execution time: 0.125 ms
(9 rows)

"""

The second query gets a bitmap scan because plain index scans have
been disabled for the UPDATE (a temporary kludge), since index-only
scans can break things - IndexOnlyRecheck() throws an error. Not quite
sure why the optimizer doesn't care about resjunk for the UPDATE,
which is presumably why in general regular updates never use
index-only scans. Since I think the actual auxiliary plan generation
needs work, so as to not have uselessly complicated plans, I didn't
try too hard to figure that out. This plan structure is not
acceptable, of course, but maybe almost the same thing would be
acceptable if the auxiliary plan shown here wasn't unnecessarily
complex - if we forced a simple pseudo-scan placeholder, without
wasting optimizer cycles, somewhat in the style of WHERE CURRENT OF.
This is something discussed in newly expanded comments within
planner.c. I would have made the optimizer produce a suitably simple
plan myself, but I don't have a good enough understanding of it to
figure out how (at least in a reasonable amount of time). Pointers on
how this might be accomplished are very welcome.

With this addition, the feature is functionally complete. That just
leaves the small matter of how it has been implemented.  :-)

This is still clearly a work in progress implementation, with design
trade-offs that are very much in need of fairly high level discussion.
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 28 August 2014 03:43, Peter Geoghegan <pg@heroku.com> wrote:

> The patch currently lacks a way of referencing datums rejected for
> insertion when updating. The way MySQL handles the issue seems
> questionable. They allow you to do something like this:
>
> INSERT INTO upsert (key, val) VALUES (1 'val') ON DUPLICATE KEY UPDATE
> val = VALUES(val);
>
> The implication is that the updated value comes from the INSERT's
> VALUES() list, but emulating that seems like a bad idea. In general,
> at least with Postgres it's entirely possible that values rejected
> differ from the values appearing in the VALUES() list, due to the
> effects of before triggers. I'm not sure whether or not we should
> assume equivalent transformations during any UPDATE before triggers.
>
> This is an open item. I think it makes sense to deal with it a bit later.

IMHO it is impossible to know if any of the other code is correct
until we have a clear and stable vision of what the command is
supposed to perform.

The inner workings are less important than what the feature does.

FWIW, the row available at the end of all BEFORE triggers is clearly
the object we should be manipulating, not the original VALUES()
clause. Otherwise this type of INSERT would behave differently from
normal INSERTs. Which would likely violate RLS, if nothing else.


> As I mentioned, I have incorporated feedback from Kevin Grittner. You
> may specify a unique index to merge on from within the INSERT
> statement, thus avoiding the risk of inadvertently having the update
> affect the wrong tuple due to the user failing to consider that there
> was a would-be unique violation within some other unique index
> constraining some other attribute. You may write the DML statement
> like this:
>
> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
> upsert_pkey UPDATE SET val = 'update';
>
> I think that there is a good chance that at least some people will
> want to make this mandatory. I guess that's fair enough, but I
> *really* don't want to *mandate* that users specify the name of their
> unique index in DML for obvious reasons. Perhaps we can come up with a
> more tasteful syntax that covers all interesting cases (consider the
> issues with partial unique indexes and before triggers for example,
> where a conclusion reached about which index to use during parse
> analysis may subsequently be invalidated by user-defined code, or
> ambiguous specifications in the face of overlapping attributes between
> two unique composite indexes, etc). The Right Thing is far from
> obvious, and there is very little to garner from other systems, since
> SQL MERGE promises essentially nothing about concurrency, both as
> specified by the standard and in practice. You don't need a unique
> index at all, and as I showed in my pgCon talk, there are race
> conditions even for a trivial UPSERT operations in all major SQL MERGE
> implementations.

Surely if there are multiple unique indexes then the result row must
be validated against all unique indexes before it is allowed at all?

The only problem I see is if the newly inserted row matches one row on
one unique value and a different row on a different unique index.
Turning the INSERT into an UPDATE will still fail on one or other, no
matter which index you pick. If there is one row for ALL unique
indexes then it is irrelevant which index you pick. So either way, I
cannot see a reason to specify an index.

If we do need such a construct, we have already the concept of an
IDENTITY for a table, added in 9.4, currently targeted at replication.
Listing indexes or columns in the DML statement is more pushups for
developers and ORMs, so lets KISS.


The way forwards, in my view, is to define precisely the behaviour we
wish to have. That definition will include the best current mechanism
for running an UPSERT using INSERT/UPDATE/loops and comparing that
against what is being provided here. We will then have a functional
test of equivalence of the approaches, and a basis for making a
performance test that shows that performance is increased without any
loss of concurrency.

Once we have that, we can then be certain our time spent on internals
is not wasted by overlooking a simple userland gotcha.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Sep 25, 2014 at 10:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> IMHO it is impossible to know if any of the other code is correct
> until we have a clear and stable vision of what the command is
> supposed to perform.

+1.

> The inner workings are less important than what the feature does.

+1.

> FWIW, the row available at the end of all BEFORE triggers is clearly
> the object we should be manipulating, not the original VALUES()
> clause. Otherwise this type of INSERT would behave differently from
> normal INSERTs. Which would likely violate RLS, if nothing else.

+1.

> Surely if there are multiple unique indexes then the result row must
> be validated against all unique indexes before it is allowed at all?
>
> The only problem I see is if the newly inserted row matches one row on
> one unique value and a different row on a different unique index.
> Turning the INSERT into an UPDATE will still fail on one or other, no
> matter which index you pick. If there is one row for ALL unique
> indexes then it is irrelevant which index you pick. So either way, I
> cannot see a reason to specify an index.

Failure could be the right thing in some cases.  For example, imagine
that a user has a table containing names, email addresses, and (with
apologies for the American-ism, but I don't know what would be
comparable elsewhere) social security numbers.  The user has unique
indexes on both email addresses and SSNs.  If a new record arrives for
the same email address, they want to replace the existing record; but
a new record arrives with the same SSN, they want the transaction to
fail.  Otherwise, a newly-arrived record might overwrite the email
address of an existing record, which they never want to do, because
they view email address as the primary key.

I think this kind of scenario will actually be pretty common.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 25 September 2014 15:35, Robert Haas <robertmhaas@gmail.com> wrote:

>> The only problem I see is if the newly inserted row matches one row on
>> one unique value and a different row on a different unique index.
>> Turning the INSERT into an UPDATE will still fail on one or other, no
>> matter which index you pick. If there is one row for ALL unique
>> indexes then it is irrelevant which index you pick. So either way, I
>> cannot see a reason to specify an index.
>
> Failure could be the right thing in some cases.  For example, imagine
> that a user has a table containing names, email addresses, and (with
> apologies for the American-ism, but I don't know what would be
> comparable elsewhere) social security numbers.  The user has unique
> indexes on both email addresses and SSNs.  If a new record arrives for
> the same email address, they want to replace the existing record; but
> a new record arrives with the same SSN, they want the transaction to
> fail.  Otherwise, a newly-arrived record might overwrite the email
> address of an existing record, which they never want to do, because
> they view email address as the primary key.

I agree with your example, but not your conclusion.

If a new record arrives with a new email address that matches an
existing record it will fail. There is a case that would be allowed,
which would be a record that creates an entirely new email address. So
you do have a point to argue from.

However, IMV enforcing such a restriction should be done with an After
trigger, which is already possible, not by complicating a DML
statement with information it shouldn't need to know, or that might
change in the future.

Let's keep this new feature as simple as possible. ORMs everywhere
need to be encouraged to implement this and they won't do it unless it
is bone simple to use.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Sep 25, 2014 at 11:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 25 September 2014 15:35, Robert Haas <robertmhaas@gmail.com> wrote:
>>> The only problem I see is if the newly inserted row matches one row on
>>> one unique value and a different row on a different unique index.
>>> Turning the INSERT into an UPDATE will still fail on one or other, no
>>> matter which index you pick. If there is one row for ALL unique
>>> indexes then it is irrelevant which index you pick. So either way, I
>>> cannot see a reason to specify an index.
>>
>> Failure could be the right thing in some cases.  For example, imagine
>> that a user has a table containing names, email addresses, and (with
>> apologies for the American-ism, but I don't know what would be
>> comparable elsewhere) social security numbers.  The user has unique
>> indexes on both email addresses and SSNs.  If a new record arrives for
>> the same email address, they want to replace the existing record; but
>> a new record arrives with the same SSN, they want the transaction to
>> fail.  Otherwise, a newly-arrived record might overwrite the email
>> address of an existing record, which they never want to do, because
>> they view email address as the primary key.
>
> I agree with your example, but not your conclusion.
>
> If a new record arrives with a new email address that matches an
> existing record it will fail. There is a case that would be allowed,
> which would be a record that creates an entirely new email address. So
> you do have a point to argue from.
>
> However, IMV enforcing such a restriction should be done with an After
> trigger, which is already possible, not by complicating a DML
> statement with information it shouldn't need to know, or that might
> change in the future.

I've never been a fan of putting the index name in there.  I agree
that's stuff that a DML statement shouldn't need to know about.  What
I've advocated for in the past is specifying the list of columns that
should be used to determine whether to insert or update.  If you have
a match on those columns, update the row; else insert.  Any other
unique indexes stand or fall as may be.

I still think that idea has merit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 28 August 2014 03:43, Peter Geoghegan <pg@heroku.com> wrote:

> "Value locking"
> ===========
>
> To date, on-list discussion around UPSERT has almost exclusively
> concerned what I've called "value locking"; the idea of locking values
> in unique indexes in the abstract (to establish the right to insert
> ahead of time). There was some useful discussion on this question
> between myself and Heikki back around December/January. Ultimately, we
> were unable to reach agreement on an approach and discussion tapered
> off. However, Heikki did understand the concerns that informed by
> design. He recognized the need to be able to easily *release* value
> locks, so as to avoid "unprincipled deadlocks", where under high
> concurrency there are deadlocks between sessions that only UPSERT a
> single row at a time. I'm not sure how widely appreciated this point
> is, but I believe that Heikki appreciates it. It is a very important
> point in my opinion. I don't want an implementation that is in any way
> inferior to the "UPSERT looping subxact" pattern does (i.e. the plpsql
> thing that the docs suggest).
>
> When we left off, Heikki continued to favor an approach that involved
> speculatively inserting heap tuples, and then deleting them in the
> event of a conflict. This design was made more complicated when the
> need to *release* value locks became apparent (Heikki ended up making
> some changes to HeapTupleSatisfiesDirty(), as well as sketching a
> design for what you might call a "super delete", where xmin can be set
> to InvalidTransactionId for speculatively-inserted heap tuples). After
> all, it wasn't as if we could abort a subxact to release locks, which
> is what the "UPSERT looping subxact" pattern does. I think it's fair
> to say that that design became more complicated than initially
> anticipated [4] [5].
>
> Anyway, the greater point here is that fundamentally, AFAICT Heikki
> and I were in agreement. Once you buy into the idea that we must avoid
> holding on to "value locks" of whatever form - as Heikki evidently did
> - then exactly what form they take is ultimately only a detail.
> Granted, it's a very important detail, but a detail nonetheless. It
> can be discussed entirely independently of all of this new stuff, and
> thank goodness for that.
>
> If anyone finds my (virtually unchanged) page heavyweight lock based
> value locking approach objectionable, I ask that the criticism be
> framed in a way that makes a sharp distinction between each of the
> following:
>
> 1. You don't accept that value locks must be easily released in the
> event of a conflict. Is anyone in this camp? It's far from obvious to
> me what side of this question Andres is on at this stage, for example.
> Robert might have something to say here too.
>
> 2. Having taken into account the experience of myself and Heikki, and
> all that is implied by taking that approach ***while avoiding
> unprincipled deadlocks***, you continue to believe that an approach
> based on speculative heap insertion, or some alternative scheme is
> better than what I have done to the nbtree code here, or you otherwise
> dislike something about the proposed value locking scheme. You accept
> that value locks must be released and released easily in the event of
> a conflict, but like Heikki you just don't like what I've done to get
> there.
>
> Since we can (I believe) talk about the value locking aspect and the
> rest of the patch independently, we should do so...unless you're in
> camp 1, in which case I guess that we'll have to thrash it out.

I'm trying to understand and help out with pushing this patch forwards
to completion.

Basically, I have absolutely no idea whether I object to or agree with
1) and don't know where to look to find out. We need a clear
exposition of design and the alternatives.

My approach would be to insert an index tuple for that value into the
index, but with the leaf ituple marked with an xid rather than a ctid.
If someone tries to insert into the index they would see this and wait
for the inserting transaction to end. The inserting transaction would
then resolve what happens in the heap (insert/update) and later
repoint the index tuple to the inserted/updated row version. I don't
see the need for page level locking since it would definitely result
in deadlocks (e.g. SQLServer).

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 11:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Basically, I have absolutely no idea whether I object to or agree with
> 1) and don't know where to look to find out. We need a clear
> exposition of design and the alternatives.
>
> My approach would be to insert an index tuple for that value into the
> index, but with the leaf ituple marked with an xid rather than a ctid.
> If someone tries to insert into the index they would see this and wait
> for the inserting transaction to end. The inserting transaction would
> then resolve what happens in the heap (insert/update) and later
> repoint the index tuple to the inserted/updated row version. I don't
> see the need for page level locking since it would definitely result
> in deadlocks (e.g. SQLServer).

The page level locks are only used to prevent concurrent insertion for
as long as it takes to get consensus to proceed among unique indexes,
and to actually insert a heap tuple. They're all released before we
lock the tuple for update, should we take that path (yes, really).
This is consistent with the behavior of other systems, I think. That's
my whole reason for preferring to do things that way. If you have a
"promise tuples" approach - be it what you outline here, or what
Heikki prototyped with heap tuple insertion, or any other - then you
need a way to *release* those "value locks" in the event of a
conflict/needing to update, before locking/updating. Otherwise, you
get deadlocks. This is an issue I highlighted when it came up with
Heikki's prototype.

AFAICT, any scheme for "value locking" needs to strongly consider the
need to *release* value locks inexpensively. Whatever else they do,
they cannot persist for the duration of the transaction IMV.

Does that make sense? If not, my next suggestion is applying an
earlier revision of Heikki's prototype, and seeing for yourself how it
can be made to deadlock in an unprincipled/impossible to prevent way
[1]. You've quite rightly highlighted the existing subxact looping
pattern as something that this needs to be better than in every way.
This is one important way in which we might fail to live up to that
standard.

[1] http://www.postgresql.org/message-id/52B4AAF0.5090806@vmware.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 7:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 25, 2014 at 10:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> IMHO it is impossible to know if any of the other code is correct
>> until we have a clear and stable vision of what the command is
>> supposed to perform.
>
> +1.
>
>> The inner workings are less important than what the feature does.
>
> +1.
>
>> FWIW, the row available at the end of all BEFORE triggers is clearly
>> the object we should be manipulating, not the original VALUES()
>> clause. Otherwise this type of INSERT would behave differently from
>> normal INSERTs. Which would likely violate RLS, if nothing else.
>
> +1.

I agree with all of this. I'm glad that my opinion on how a
CONFLICTING() expression interacts with BEFORE triggers is accepted,
too.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I've never been a fan of putting the index name in there.

Me neither. Although I do understand Kevin's concern about the user's
intent surrounding which unique index to merge on.

> I agree
> that's stuff that a DML statement shouldn't need to know about.  What
> I've advocated for in the past is specifying the list of columns that
> should be used to determine whether to insert or update.  If you have
> a match on those columns, update the row; else insert.  Any other
> unique indexes stand or fall as may be.
>
> I still think that idea has merit.

As I've said, my problem with that idea is the corner cases. Consider
the possible ambiguity. Could DML queries in production start failing
("ambiguous unique index specification") because the DBA created a new
unique index on attributes that somewhat overlap the attributes of the
unique index that the DML author actually meant? What about the
effects of BEFORE triggers, and their interaction with partial unique
indexes? If you can describe an exact behavior that overcomes these
issues, then I'll give serious consideration to implementing it. As
things stand, to be perfectly honest it sounds like a footgun to me.
There are interactions that make getting it right very ticklish.

I don't want to make a unique index specification mandatory because
that's ugly - that's the only reason, TBH. However, while what you
describe here accomplishes the same thing without being ugly, it is
potentially very surprising. Naming the unique index directly has the
great advantage of very clearly demonstrating user intent.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Sep 25, 2014 at 2:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> 1. You don't accept that value locks must be easily released in the
>> event of a conflict. Is anyone in this camp? It's far from obvious to
>> me what side of this question Andres is on at this stage, for example.
>> Robert might have something to say here too.
>>
>> 2. Having taken into account the experience of myself and Heikki, and
>> all that is implied by taking that approach ***while avoiding
>> unprincipled deadlocks***, you continue to believe that an approach
>> based on speculative heap insertion, or some alternative scheme is
>> better than what I have done to the nbtree code here, or you otherwise
>> dislike something about the proposed value locking scheme. You accept
>> that value locks must be released and released easily in the event of
>> a conflict, but like Heikki you just don't like what I've done to get
>> there.
>>
>> Since we can (I believe) talk about the value locking aspect and the
>> rest of the patch independently, we should do so...unless you're in
>> camp 1, in which case I guess that we'll have to thrash it out.
>
> I'm trying to understand and help out with pushing this patch forwards
> to completion.
>
> Basically, I have absolutely no idea whether I object to or agree with
> 1) and don't know where to look to find out. We need a clear
> exposition of design and the alternatives.

I laughed when I read this, because I think a lot of the discussion on
this topic has been unnecessarily muddled by jargon.

> My approach would be to insert an index tuple for that value into the
> index, but with the leaf ituple marked with an xid rather than a ctid.
> If someone tries to insert into the index they would see this and wait
> for the inserting transaction to end. The inserting transaction would
> then resolve what happens in the heap (insert/update) and later
> repoint the index tuple to the inserted/updated row version. I don't
> see the need for page level locking since it would definitely result
> in deadlocks (e.g. SQLServer).

I think that something like this might work, but the devil is in the
details.  Suppose two people try to upsert into the same table at the
same time.  There's one index.  If the transactions search that index
for conflicts first, neither sees any conflicting tuples, and both
proceed.  That's no good.  OK, so suppose each transaction inserts the
special index tuple which you mention, to lock out concurrent inserts
of that value, and then searches for already-existing conflicts.  Each
sees the other's tuple, and they deadlock.  That's no good, either.

Also, I think there are other cases where we think we're going to
insert, so we put the special index tuple in there, but then we decide
to update, so we don't need the promise tuple any more, but other
sessions are potentially still waiting for our XID to terminate even
though there's no conflict any more.  I'm having a hard time bringing
the details of those cases to mind ATM, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think that something like this might work, but the devil is in the
> details.  Suppose two people try to upsert into the same table at the
> same time.  There's one index.  If the transactions search that index
> for conflicts first, neither sees any conflicting tuples, and both
> proceed.  That's no good.  OK, so suppose each transaction inserts the
> special index tuple which you mention, to lock out concurrent inserts
> of that value, and then searches for already-existing conflicts.  Each
> sees the other's tuple, and they deadlock.  That's no good, either.

I'm very glad that you share my concern about deadlocks like this.

> Also, I think there are other cases where we think we're going to
> insert, so we put the special index tuple in there, but then we decide
> to update, so we don't need the promise tuple any more, but other
> sessions are potentially still waiting for our XID to terminate even
> though there's no conflict any more.  I'm having a hard time bringing
> the details of those cases to mind ATM, though.

Well, you might have a promise tuple in a unique index on attributes
not appearing in the UPDATE's targetlist, for one. You have the other
session waiting (doesn't have to be an upserter) just because we
*thought about* inserting a value as part of an upsert. That's pretty
bad.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 7:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> The way forwards, in my view, is to define precisely the behaviour we
> wish to have. That definition will include the best current mechanism
> for running an UPSERT using INSERT/UPDATE/loops and comparing that
> against what is being provided here. We will then have a functional
> test of equivalence of the approaches, and a basis for making a
> performance test that shows that performance is increased without any
> loss of concurrency.

That sounds very reasonable.  While I'm sure that what I have here can
decisively beat the xact looping pattern in terms of performance as
measured by pgbench, the real performance advantage is that this
approach doesn't burn through XIDs. That was a concern that Andres
highlighted in relation to using the subxact looping pattern with
BDR's multi-master replication conflict resolution.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 25 September 2014 20:11, Robert Haas <robertmhaas@gmail.com> wrote:

>> My approach would be to insert an index tuple for that value into the
>> index, but with the leaf ituple marked with an xid rather than a ctid.
>> If someone tries to insert into the index they would see this and wait
>> for the inserting transaction to end. The inserting transaction would
>> then resolve what happens in the heap (insert/update) and later
>> repoint the index tuple to the inserted/updated row version. I don't
>> see the need for page level locking since it would definitely result
>> in deadlocks (e.g. SQLServer).
>
> I think that something like this might work, but the devil is in the
> details.  Suppose two people try to upsert into the same table at the
> same time.  There's one index.  If the transactions search that index
> for conflicts first, neither sees any conflicting tuples, and both
> proceed.  That's no good.  OK, so suppose each transaction inserts the
> special index tuple which you mention, to lock out concurrent inserts
> of that value, and then searches for already-existing conflicts.  Each
> sees the other's tuple, and they deadlock.  That's no good, either.

The test index is unique, so our to-be-inserted value exists on only
one page, hence page locking applies while we insert it. The next
person to insert waits for the page lock and then sees the test tuple.

The page lock lasts only for the duration of the insertion of the
ituple, not for the whole operation.

> Also, I think there are other cases where we think we're going to
> insert, so we put the special index tuple in there, but then we decide
> to update, so we don't need the promise tuple any more, but other
> sessions are potentially still waiting for our XID to terminate even
> though there's no conflict any more.  I'm having a hard time bringing
> the details of those cases to mind ATM, though.

We make the decision to INSERT or UPDATE based upon what we find in
the test index. If a value if there already, we assume its an UPDATE
and go to update the row this points to. If it has been deleted we
loop back and try again/error. If the value is not present, we insert
the test tuple and progress as an INSERT, then loop back later to set
the ctid. There is no case of "don't need promise id anymore". We
would use the PK, identity or first unique index as the test index.
There is a case where an UPSERT conflicts with an INSERT causing the
latter to abort.

Anyway, this is why we need the design more clearly exposed, so you
can tell me I'm wrong by showing me the URL of it done right.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 25 September 2014 19:59, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Sep 25, 2014 at 9:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I've never been a fan of putting the index name in there.
>
> Me neither. Although I do understand Kevin's concern about the user's
> intent surrounding which unique index to merge on.

The use case cited is real. My solution of using an after trigger
works yet without adding specific functionality to this command. So we
can achieve what users want without complicating things here.

If we do decide we really want it, lets add it as a later patch.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 25 September 2014 20:38, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Sep 25, 2014 at 7:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The way forwards, in my view, is to define precisely the behaviour we
>> wish to have. That definition will include the best current mechanism
>> for running an UPSERT using INSERT/UPDATE/loops and comparing that
>> against what is being provided here. We will then have a functional
>> test of equivalence of the approaches, and a basis for making a
>> performance test that shows that performance is increased without any
>> loss of concurrency.
>
> That sounds very reasonable.

So I promise not to discuss locking until we get the first things done.

My suggested approach to get this committed is...

A. UPDATE/INSERT privilege infrastructure.
Add tests to it, make it separately committable, so we can get that done.
Submit to Oct CF; get that done early.

B. Agree command semantics by producing these things
* Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
* SQL Reference Documentation (INSERT)
* Test cases for feature
* Test cases for concurrency
* Test cases for pgbench

All of the above, as a separate committable patch. I hate the fact
that you have written no user facing documentation for this feature.
How can anyone tell whether the tests you've written are correct or
even consistent to a particular definition of correctness?
Submit as patch for review only to Oct 15 CF
We then agree what is required for further work
At this stage, poll the Django and Rails communities for acceptance
and early warning of these features. Listen.

C. Internal weirdness
Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch
deadline, so we can fine tune for last CF.
Then Heikki rewrites half your patch in a better way, you thank him
and then we commit. All done.

> While I'm sure that what I have here can
> decisively beat the xact looping pattern in terms of performance as
> measured by pgbench, the real performance advantage is that this
> approach doesn't burn through XIDs. That was a concern that Andres
> highlighted in relation to using the subxact looping pattern with
> BDR's multi-master replication conflict resolution.

But we're still discussing SQL semantics. So first things first, then
loop back around, hoping our design has not been concurrently
deleted...

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 1:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> The test index is unique, so our to-be-inserted value exists on only
> one page, hence page locking applies while we insert it. The next
> person to insert waits for the page lock and then sees the test tuple.
>
> The page lock lasts only for the duration of the insertion of the
> ituple, not for the whole operation.

(by page lock, I take it you mean buffer lock - converting that into a
page hwlock is what I do).

This is where it gets quite complicated. What happens if row locking
on upsert finds a conflict update changing uniquely-constrained
attributes? Sure, a vanilla non-HOT update will fail on inserting a
unique index tuple, but *it* can still cause us a row-level conflict,
and *it* can only fail (with a dup violation) when we commit/abort.
But now we're obligated to wait on it to get the row lock, and it's
obligated to wait on us to get the promise tuple lock, or any other
sort of "value lock" that hasn't already been released when we go to
row lock. Deadlock.

You cannot get away with failing to release the promise tuple/value
lock if you want to maintain useful guarantees.

It doesn't need to be a vanilla non-HOT update.  That's just the
simplest example I can think of.

>> Also, I think there are other cases where we think we're going to
>> insert, so we put the special index tuple in there, but then we decide
>> to update, so we don't need the promise tuple any more, but other
>> sessions are potentially still waiting for our XID to terminate even
>> though there's no conflict any more.  I'm having a hard time bringing
>> the details of those cases to mind ATM, though.
>
> We make the decision to INSERT or UPDATE based upon what we find in
> the test index. If a value if there already, we assume its an UPDATE
> and go to update the row this points to. If it has been deleted we
> loop back and try again/error.

Sure, you can throw an error, and that makes things a lot easier. It
also implies that the implementation is inferior to the subxact
looping pattern, which you've already implied is a thing we must beat
in every way. Frankly, I think it's a cop-out to just throw an error,
and I don't think it'll end up being some theoretical risk. It'll
happen often if it is allowed to happen at all. Allowing it to happen
almost defeats the purpose of the feature - the big appeal of the
feature is that it makes guarantees about the outcome.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> A. UPDATE/INSERT privilege infrastructure.
> Add tests to it, make it separately committable, so we can get that done.
> Submit to Oct CF; get that done early.

Makes sense. As long as we assume that we want a unified syntax like
this - that is, that we need something vaguely insert-update or
update-insertish - then we need this. Unfortunately, we cannot add
regression tests for this without almost the full patch set.

> B. Agree command semantics by producing these things
> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
> * SQL Reference Documentation (INSERT)
> * Test cases for feature
> * Test cases for concurrency
> * Test cases for pgbench

Okay. I do have stress-tests, that are separately maintained, in case
you missed that:

https://github.com/petergeoghegan/upsert

> All of the above, as a separate committable patch. I hate the fact
> that you have written no user facing documentation for this feature.
> How can anyone tell whether the tests you've written are correct or
> even consistent to a particular definition of correctness?

I'd hoped that the commit messages, and my discussion of the feature
were adequate.

> Submit as patch for review only to Oct 15 CF
> We then agree what is required for further work
> At this stage, poll the Django and Rails communities for acceptance
> and early warning of these features. Listen.

I know an original founder of the Django project quite well - Jacob
Kaplan-Moss (a co-worker - the guy that keynoted pgOpen in its second
year). He is very interested in this effort.

> C. Internal weirdness
> Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch
> deadline, so we can fine tune for last CF.
> Then Heikki rewrites half your patch in a better way, you thank him
> and then we commit. All done.

I don't have a problem with Heikki or anyone else rewriting the value
locking part of the patch, provided it meets my requirements for such
a mechanism. Since Heikki already agreed that that standard should be
imposed, he'd hardly take issue with it now.

However, the fact is that once you actually make something like
promise tuples meet that standard, at the very least it becomes a lot
messier than you'd think. Heikki's final prototype "super deleted"
tuples by setting their xmin to InvalidTransactionId. We weren't sure
that that doesn't break some random other heapam code. Consider this,
for example:

https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/executor/execMain.c#L1961

So that looks safe in the face of setting xmin to InvalidTransactionId
in the way the later prototype patch did if you think about it for a
while, but there are other places where that is less clear. In short,
it becomes something that we have to worry about for ever, because
"xmin cannot change without the tuple in the slot changing" is clearly
an invariant for certain purposes. It might accidentally fail to fail
right now, but I'm not comfortable with it.

Now, I might be convinced that that's actually the way to go. I have
an open mind. But that will take discussion. I like that page
hwlocking is something that many systems do (even including Oracle, I
believe). Making big changes to nbtree is always something that
deserves to be met with skepticism, but it is nice to have an
implementation that lives in the head of AM.

Sorry, I forgot to not talk about locking.

> But we're still discussing SQL semantics. So first things first, then
> loop back around, hoping our design has not been concurrently
> deleted...

I hope the discussion can avoid "unprincipled deadlocks"....

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 25 September 2014 22:13, Peter Geoghegan <pg@heroku.com> wrote:

>> All of the above, as a separate committable patch. I hate the fact
>> that you have written no user facing documentation for this feature.
>> How can anyone tell whether the tests you've written are correct or
>> even consistent to a particular definition of correctness?
>
> I'd hoped that the commit messages, and my discussion of the feature
> were adequate.

I'd hoped that my discussion was sufficient to persuade you too, but it wasn't.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 2:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I'd hoped that the commit messages, and my discussion of the feature
>> were adequate.
>
> I'd hoped that my discussion was sufficient to persuade you too, but it wasn't.

I'll write user-visible docs soon, then.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Gavin Flower
Date:
On 26/09/14 08:21, Simon Riggs wrote:
> On 25 September 2014 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
>
>>> My approach would be to insert an index tuple for that value into the
>>> index, but with the leaf ituple marked with an xid rather than a ctid.
>>> If someone tries to insert into the index they would see this and wait
>>> for the inserting transaction to end. The inserting transaction would
>>> then resolve what happens in the heap (insert/update) and later
>>> repoint the index tuple to the inserted/updated row version. I don't
>>> see the need for page level locking since it would definitely result
>>> in deadlocks (e.g. SQLServer).
>> I think that something like this might work, but the devil is in the
>> details.  Suppose two people try to upsert into the same table at the
>> same time.  There's one index.  If the transactions search that index
>> for conflicts first, neither sees any conflicting tuples, and both
>> proceed.  That's no good.  OK, so suppose each transaction inserts the
>> special index tuple which you mention, to lock out concurrent inserts
>> of that value, and then searches for already-existing conflicts.  Each
>> sees the other's tuple, and they deadlock.  That's no good, either.
> The test index is unique, so our to-be-inserted value exists on only
> one page, hence page locking applies while we insert it. The next
> person to insert waits for the page lock and then sees the test tuple.
>
> The page lock lasts only for the duration of the insertion of the
> ituple, not for the whole operation.
>
>> Also, I think there are other cases where we think we're going to
>> insert, so we put the special index tuple in there, but then we decide
>> to update, so we don't need the promise tuple any more, but other
>> sessions are potentially still waiting for our XID to terminate even
>> though there's no conflict any more.  I'm having a hard time bringing
>> the details of those cases to mind ATM, though.
> We make the decision to INSERT or UPDATE based upon what we find in
> the test index. If a value if there already, we assume its an UPDATE
> and go to update the row this points to. If it has been deleted we
> loop back and try again/error. If the value is not present, we insert
> the test tuple and progress as an INSERT, then loop back later to set
> the ctid. There is no case of "don't need promise id anymore". We
> would use the PK, identity or first unique index as the test index.
> There is a case where an UPSERT conflicts with an INSERT causing the
> latter to abort.
>
> Anyway, this is why we need the design more clearly exposed, so you
> can tell me I'm wrong by showing me the URL of it done right.
>
What happens if the new value(s) of the INERT/UPDATE require the page to 
be split?

I assume the mechanics of this are  catered for, but how does it affect 
locking & potential deadlocks?


Cheers,
Gavin



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 09/26/2014 12:13 AM, Peter Geoghegan wrote:
> On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> C. Internal weirdness
>> Submit C based upon earlier agreed B, submit to Dec 15 CF, major patch
>> deadline, so we can fine tune for last CF.
>> Then Heikki rewrites half your patch in a better way, you thank him
>> and then we commit. All done.
>
> I don't have a problem with Heikki or anyone else rewriting the value
> locking part of the patch, provided it meets my requirements for such
> a mechanism. Since Heikki already agreed that that standard should be
> imposed, he'd hardly take issue with it now.
>
> However, the fact is that once you actually make something like
> promise tuples meet that standard, at the very least it becomes a lot
> messier than you'd think. Heikki's final prototype "super deleted"
> tuples by setting their xmin to InvalidTransactionId. We weren't sure
> that that doesn't break some random other heapam code. Consider this,
> for example:
>
> https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/executor/execMain.c#L1961
>
> So that looks safe in the face of setting xmin to InvalidTransactionId
> in the way the later prototype patch did if you think about it for a
> while, but there are other places where that is less clear. In short,
> it becomes something that we have to worry about for ever, because
> "xmin cannot change without the tuple in the slot changing" is clearly
> an invariant for certain purposes. It might accidentally fail to fail
> right now, but I'm not comfortable with it.

Just to be clear: I wrote the initial patch to demonstrate what I had in 
mind, because I was not able to explain it well enough otherwise. You 
pointed out issues with it, which I then fixed. You then pointed out 
more issues, which I then fixed again.

My patch version was a proof of concept, to demonstrate that it can be 
done. What I'd like you to do now, as the patch author, is to take the 
promise tuple approach and clean it up. If the xmin stuff is ugly, 
figure out some other way to do it.

> Now, I might be convinced that that's actually the way to go. I have
> an open mind. But that will take discussion. I like that page
> hwlocking is something that many systems do (even including Oracle, I
> believe). Making big changes to nbtree is always something that
> deserves to be met with skepticism, but it is nice to have an
> implementation that lives in the head of AM.

I don't know what you mean by "in the head of AM", but IMO it would be 
far better if we can implement this outside the index AMs. Then it will 
work with any index AM.

BTW, in the discussions, you pointed out that exclusion constraints 
currently behave differently from a unique index, when two backends 
insert a tuple at the same time. With a unique index, one of them will 
fail, but one is always guaranteed to succeed. With an exclusion 
constraint, they can both fail if you're unlucky. I think the promise 
tuples would allow us to fix that, too, while we're at it. In fact, you 
might consider tackling that first, and build the new INSERT ON CONFLICT 
syntax on top of that. Basically, an INSERT to a table with an exclusion 
constraint would be the same as "INSERT ON CONFLICT throw an error". 
That would be a useful way to split this patch into two parts.

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
> I don't know what you mean by "in the head of AM", but IMO it would be far
> better if we can implement this outside the index AMs. Then it will work
> with any index AM.

Also, it's the only chance to make this ever work across partitions.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 09/26/2014 03:30 PM, Andres Freund wrote:
> On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
>> I don't know what you mean by "in the head of AM", but IMO it would be far
>> better if we can implement this outside the index AMs. Then it will work
>> with any index AM.
>
> Also, it's the only chance to make this ever work across partitions.

How so? Assuming there's no overlap in the partitions, you could lock 
the page in the index of the partition you're inserting to, just like 
you would insert the promise tuple to the right partition.

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-26 15:32:35 +0300, Heikki Linnakangas wrote:
> On 09/26/2014 03:30 PM, Andres Freund wrote:
> >On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
> >>I don't know what you mean by "in the head of AM", but IMO it would be far
> >>better if we can implement this outside the index AMs. Then it will work
> >>with any index AM.
> >
> >Also, it's the only chance to make this ever work across partitions.
> 
> How so? Assuming there's no overlap in the partitions, you could lock the
> page in the index of the partition you're inserting to, just like you would
> insert the promise tuple to the right partition.

Well, the 'no overlap' case is boring. At least if you mean that each
partition has disctinct value ranges in the index?

And the reason that the buffer locking approach in the overlapping case
is that you'd need to hold a large number of pages locked at the same
time. Right?

But primarily I mean that bulk of the uniqueness checking logic has to
live outside the individual AMs. It doesn't sound enticing to reach from
inside one AM into another partitions index to do stuff.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 09/26/2014 03:40 PM, Andres Freund wrote:
> On 2014-09-26 15:32:35 +0300, Heikki Linnakangas wrote:
>> On 09/26/2014 03:30 PM, Andres Freund wrote:
>>> On 2014-09-26 15:24:21 +0300, Heikki Linnakangas wrote:
>>>> I don't know what you mean by "in the head of AM", but IMO it would be far
>>>> better if we can implement this outside the index AMs. Then it will work
>>>> with any index AM.
>>>
>>> Also, it's the only chance to make this ever work across partitions.
>>
>> How so? Assuming there's no overlap in the partitions, you could lock the
>> page in the index of the partition you're inserting to, just like you would
>> insert the promise tuple to the right partition.
>
> Well, the 'no overlap' case is boring.

Ok.

> At least if you mean that each partition has disctinct value ranges
> in the index?

Right.

> And the reason that the buffer locking approach in the overlapping case
> is that you'd need to hold a large number of pages locked at the same
> time. Right?

Yeah, you would. To be honest, I didn't even think about the overlapping 
case, I just assumed that the overlapping case is the typical one and 
only thought about that.

> But primarily I mean that bulk of the uniqueness checking logic has to
> live outside the individual AMs. It doesn't sound enticing to reach from
> inside one AM into another partitions index to do stuff.

Yeah, that's a non-starter. Even with the index locking stuff, though, 
it wouldn't be the AM's responsibility to reach out to other partitions.

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-26 15:58:17 +0300, Heikki Linnakangas wrote:
> On 09/26/2014 03:40 PM, Andres Freund wrote:
> >And the reason that the buffer locking approach in the overlapping case
> >is that you'd need to hold a large number of pages locked at the same
> >time. Right?
> 
> Yeah, you would. To be honest, I didn't even think about the overlapping
> case, I just assumed that the overlapping case is the typical one and only
> thought about that.

I think it's actually quite common to want to have uniqueness constraint
overlapping partitions. Consider e.g. partitioning on the username. You
might still want to ensure emails are unique.

> >But primarily I mean that bulk of the uniqueness checking logic has to
> >live outside the individual AMs. It doesn't sound enticing to reach from
> >inside one AM into another partitions index to do stuff.
> 
> Yeah, that's a non-starter. Even with the index locking stuff, though, it
> wouldn't be the AM's responsibility to reach out to other partitions.

I'm thinking of the way btree currently does uniqueness checks. Unless
you move a large chunk of that out of the AM you'll have a hard time
building anything crossing partitions based on it. At least I can't see
how.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Sep 26, 2014 at 5:24 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> Just to be clear: I wrote the initial patch to demonstrate what I had in
> mind, because I was not able to explain it well enough otherwise. You
> pointed out issues with it, which I then fixed. You then pointed out more
> issues, which I then fixed again.

> My patch version was a proof of concept, to demonstrate that it can be done.

Right. It was a rough prototype built to prove a point. It also served
to show what I was talking about as regards deadlocks (and how the
locks could problematically persist in other ways), which I was
previously unable to effectively explain to Andres. So it was a very
useful exercise, and I wish we did that kind of thing more frequently.
But at the same time, I don't want to hold you to that prototype, or
misrepresent that prototype as showing your final position on any
technical issue. So please correct me if I do that. I've tried to be
careful about that.

> What I'd like you to do now, as the patch author, is to take the promise
> tuple approach and clean it up. If the xmin stuff is ugly, figure out some
> other way to do it.

My concern with the xmin stuff is not that it's ugly; it's that it's
potentially dangerous. It isn't at all easy to reason about where bugs
might appear - lots of things could interact with it in unpredictable
ways. I think we'd have to audit a lot of code, all over the place,
just to make sure nowhere had an assumption broken. This is a big
issue. You are asking me to find a way to save a design that I don't
particularly believe in. That might change, but right now I'm afraid
that that's the reality. Whereas, my design is entirely contained in
the file nbtinsert.c.

> I don't know what you mean by "in the head of AM", but IMO it would be far
> better if we can implement this outside the index AMs. Then it will work
> with any index AM.

I mean that "value locking" is an abstraction that lives in the head
of amcanunique AMs. That kind of encapsulation has considerable value
in reducing the risk of bugs. If what I've done has bugs, there isn't
that many places that could expose interactions with other complicated
code. There are fewer moving parts. It's a generalization of the
existing mechanism for unique index enforcement. Plus, database
systems have used heavyweight index locks for this kind of thing since
the 1970s. That's how this works everywhere else (SQL server certainly
does this for MERGE [1], and only grabs the page-level lock for a
second at lower isolation levels, as in my implementation). I think
that that ought to count for something.

I will be frank. Everyone knows that the nbtree locking parts of this
are never going to be committed over your objections. It cannot
happen. And yet, I persist in proposing that we go that way. I may be
stubborn, but I am not so stubborn that I'd jeopardize all the work
I've put into this to save one aspect of it that no one really cares
about anyway (even I only care about meeting my goals for user visible
behavior [2]). I may actually come up with a better way to make what
you outline work; then again, I may not. I have no idea, to be honest.
It's pretty clear that I'm going to have a hard time getting your
basic approach to value locking accepted without rethinking it a lot,
though. Can you really say that you won't have serious misgivings
about something like the "tuple->xmin = InvalidTransactionId"
swapping, if I actually formally propose it? That's very invasive to a
lot of places. And right now, I have no idea how we could do better.

I really only want to get to where we have a design that's acceptable.
In all sincerity, I may yet be convinced to go your way. It's possible
that I've failed to fully understand your concerns. Is it really just
about making INSERT ... ON CONFLICT IGNORE work with exclusion
constraints (UPDATE clearly makes little sense)?

> Basically, an INSERT to a table with an exclusion constraint would be the
> same as "INSERT ON CONFLICT throw an error". That would be a useful way to
> split this patch into two parts.

I'll think about it. I don't want to do that until I see a way to make
your approach to value locking work in a way that someone is actually
going to be comfortable committing. I am looking for one.

By the way, IMO stress testing has a very useful role to play in the
development of this feature. I've been doing things like trying to
flush out races by running long stress tests with random delays
artificially added at key points. I would like to make that part of
the testing strategy public and transparent.

[1] http://weblogs.sqlteam.com/mladenp/archive/2007/08/03/60277.aspx
[2] Slide 8, "Goals for UPSERT in Postgres":
http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Sep 26, 2014 at 5:58 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
>> And the reason that the buffer locking approach in the overlapping case
>> is that you'd need to hold a large number of pages locked at the same
>> time. Right?
>
>
> Yeah, you would. To be honest, I didn't even think about the overlapping
> case, I just assumed that the overlapping case is the typical one and only
> thought about that.

I'm not sure that I follow. Unique constraints don't work across
partitions today. Why should this work across partitions in the most
general case? Simply because there'd have to be one page lock held per
unique index/partition, where promise tuples are somewhat like row
locks, so presumably only one lock table entry is required?

In other database systems with better partitioning support, there is
such a thing as indexes that apply across all partitions ("global
indexes"). There are also "local indexes", that can only be unique if
that comports with the partitioning key in a way that makes sense. But
we don't have anything like global indexes, and even in those other
systems there are huge caveats around MERGE and its impact on global
indexes (they are automatically *marked unusable* by an SQL MERGE
command). So I think making what you have in mind here work for
current Postgres partitioning is totally unrealistic, unless (at the
very least) someone also goes and writes a global index feature, which
is obviously an enormous project.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Alvaro Herrera
Date:
Peter Geoghegan wrote:

> Can you really say that you won't have serious misgivings
> about something like the "tuple->xmin = InvalidTransactionId"
> swapping, if I actually formally propose it? That's very invasive to a
> lot of places. And right now, I have no idea how we could do better.

FWIW there are 28 callers of HeapTupleHeaderGetXmin.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> FWIW there are 28 callers of HeapTupleHeaderGetXmin.

31 by my count, though that difference hardly matters. A lot of those
callers are in parts of the code that I don't know well. For example,
CheckForSerializableConflictOut().

Don't forget about direct callers to HeapTupleHeaderGetRawXmin(),
though. There are plenty of those in tqual.c.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Sep 26, 2014 at 3:25 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> FWIW there are 28 callers of HeapTupleHeaderGetXmin.

> Don't forget about direct callers to HeapTupleHeaderGetRawXmin(),
> though. There are plenty of those in tqual.c.

Which reminds me: commit 37484ad2 added the opportunistic freezing
stuff. To quote the commit message:

"""
Instead of changing the tuple xmin to FrozenTransactionId, the combination
of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never
set together, is now defined as HEAP_XMIN_FROZEN.  A variety of previous
proposals to freeze tuples opportunistically before vacuum_freeze_min_age
is reached have foundered on the objection that replacing xmin by
FrozenTransactionId might hinder debugging efforts when things in this
area go awry; this patch is intended to solve that problem by keeping
the XID around (but largely ignoring the value to which it is set).

"""

Why wouldn't the same objection (the objection that the earlier
opportunistic freezing ideas stalled on) apply to directly setting
tuple xmin to InvalidTransactionId?

You get the idea, though: Making promise tuples possible to release
early (before transaction end) by setting tuple xmin to
InvalidTransactionId is certainly hard to get right, and seems
dangerous.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> At this stage, poll the Django and Rails communities for acceptance
> and early warning of these features. Listen.

FYI, I have asked for input from the Django developers here:

https://groups.google.com/forum/#!topic/django-developers/hdzkoLYVjBY

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 27 September 2014 23:23, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Sep 25, 2014 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I hate the fact
>> that you have written no user facing documentation for this feature.
>
> Attached patch adds a commit to the existing patchset. For the
> convenience of reviewers, I've uploaded and made publicly accessible a
> html build of the documentation. This page is of most interest:
>
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

My request was for the following...

Agree command semantics by producing these things
* Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
* SQL Reference Documentation (INSERT)
* Test cases for feature
* Test cases for concurrency
* Test cases for pgbench

because it forces you to show in detail how the command works. Adding
a few paragraphs to the INSERT page with two quick examples is not the
same level of detail at all and leaves me with the strong impression
my input has been assessed as ON CONFLICT IGNORE.

Examples of the following are needed

"ON CONFLICT UPDATE optionally accepts a WHERE clause condition. When
provided, the statement only procedes with updating if the condition
is satisfied. Otherwise, unlike a conventional UPDATE, the row is
still locked for update. Note that the condition is evaluated last,
after a conflict has been identified as a candidate to update."
Question arising: do you need to specify location criteria, or is this
an additional filter? When/why would we want that?

"Failure to anticipate and prevent would-be unique violations
originating in some other unique index than the single unique index
that was anticipated as the sole source of would-be uniqueness
violations can result in updating a row other than an existing row
with conflicting values (if any)."
In English, please

How would you do "if colA = 3 then ignore else update"?

No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
syntax used in triggers

The page makes no mention of the upsert problem, nor is any previous
code mentioned.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> My request was for the following...
>
> Agree command semantics by producing these things
> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert)

Do you really think I could get an entire chapter out of this?

> * SQL Reference Documentation (INSERT)
> * Test cases for feature
> * Test cases for concurrency

All of these were added. There are two new sets of isolation tests,
one per variant of the new clause (IGNORE/UPDATE).

> * Test cases for pgbench

They're not part of the patch proper, but as I've already mentioned I
have pgbench based stress-tests on Github. There is a variety of
test-cases that test the feature under high concurrency:

https://github.com/petergeoghegan/upsert

> Examples of the following are needed
>
> "ON CONFLICT UPDATE optionally accepts a WHERE clause condition.

Yes, I realized I missed an example of that one the second I hit
"send". The MVCC interactions of this are discussed within
transaction-iso.html, FWIW.

> Question arising: do you need to specify location criteria, or is this
> an additional filter? When/why would we want that?

It is an additional way to specify a predicate/condition to UPDATE on.
There might be a kind of redundancy, if you decided to repeat the
constrained values in the predicate too, but if you're using the WHERE
clause sensibly there shouldn't be. So your UPDATE's "full predicate"
is sort of the union of the constrained values that the conflict path
was taken for, plus whatever you put in the WHERE clause, but not
quite because they're evaluated at different times (as explained
within transaction-iso.html).

> How would you do "if colA = 3 then ignore else update"?

Technically, you can't do that exact thing. IGNORE is just for quickly
dealing with ETL-type problems (and it is reasonable to use it without
one particular unique index in mind, unlike ON CONFLICT UPDATE) -
think pgloader. But if you did this:

INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB =
CONFLICTING(colB) WHERE colA != 3

Then you would achieve almost the same thing. You wouldn't have
inserted or updated anything if the only rows considered had a colA of
3, but any such rows considered would be locked, which isn't the same
as IGNOREing them.

> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
> syntax used in triggers

Why should it be the same?

> The page makes no mention of the upsert problem, nor is any previous
> code mentioned.

What's the upsert problem? I mean, apart from the fact that we don't
have it. Note that it is documented that one of the two outcomes is
guaranteed.

I should have updated the plpgsql looping subxact example, though.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andreas Karlsson
Date:
On 09/28/2014 09:40 AM, Peter Geoghegan wrote:
>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>> syntax used in triggers
>
> Why should it be the same?

Both can be seen as cases where you refer to a field of a tuple, which 
is usually done with FOO.bar.

-- 
Andreas Karlsson



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/28/2014 03:40 PM, Peter Geoghegan wrote:
> Do you really think I could get an entire chapter out of this?

Yes. It might be a short chapter, but once you extract the existing
upsert example from the docs and how why the naïve approach doesn't work
there'll be enough to go on.

People get this wrong a *lot*.

http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/
http://stackoverflow.com/q/17267417/398670
http://stackoverflow.com/q/1109061/398670

I'm happy to help with documenting it.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 28 September 2014 08:40, Peter Geoghegan <pg@heroku.com> wrote:
> On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> My request was for the following...
>>
>> Agree command semantics by producing these things
>> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert)

...

> INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB =
> CONFLICTING(colB) WHERE colA != 3
>
> Then you would achieve almost the same thing. You wouldn't have
> inserted or updated anything if the only rows considered had a colA of
> 3, but any such rows considered would be locked, which isn't the same
> as IGNOREing them.
>
>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>> syntax used in triggers
>
> Why should it be the same?

Good question. What could be wrong with making up new syntax?

The obvious answer is because we would simply have nothing to guide
us. No principles that can be applied, just opinions.

My considered opinion is that the above syntax is
* non-standard
* inconsistent with what we have elsewhere
* an additional item for implementors to handle

I could use more emotive words here, but the above should suffice to
cover my unease at inventing new SQL constructs. This is Postgres.

What worries me the most is that ORM implementors everywhere will
simply ignore our efforts, leaving us with something we'd much rather
we didn't have. As a possible committer of this feature, I would not
wish to put my name to that. You will need one a committer who will do
that.

Which brings me back to the SQL Standard, which is MERGE. We already
know the MERGE command does not fully and usefully define its
concurrent behaviour; I raised this 6 years ago. It's not clear to me
that that we couldn't more closely define the behaviour for a subset
of the command.

If we implement MERGE, then we will help ORM developers do less work
to support Postgres, which will encourage adoption.

My proposal would be to implement only a very limited syntax for MERGE
in this release, replacing this

> INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB =
> CONFLICTING(colB) WHERE colA != 3

with this...

MERGE INTO tab USING VALUES ('foo')
WHEN NOT MATCHED THENINSERT (colB)
WHEN MATCHED THENUPDATE SET colB = NEW.p1

and throwing "ERROR: full syntax for MERGE not implemented yet" if
people stretch too far.

If there is some deviation from the standard, it can be explained
clearly, though I don't see we would need to do that - we can extend
beyond the standard to explain the concurrent behaviour. And we will
be a lot closer to getting full MERGE also.

Doing MERGE syntax is probably about 2 weeks work, which is better
than 2 weeks per ORM to support the new Postgres-only syntax.

Thanks for your efforts to bring this to a conclusion.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> MERGE INTO tab USING VALUES ('foo')
> WHEN NOT MATCHED THEN
>  INSERT (colB)
> WHEN MATCHED THEN
>  UPDATE SET colB = NEW.p1
>
> and throwing "ERROR: full syntax for MERGE not implemented yet" if
> people stretch too far.

That isn't the MERGE syntax either. Where is the join?

I've extensively discussed why I think we should avoid calling
something upsert-like MERGE, as you know:

http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com

We *should* have a MERGE feature, but one that serves the actual MERGE
use-case well. That is an important use-case; it just isn't the one
I'm interested in right now.

FWIW, I agree that it wouldn't be much work to do this - what you
present here really is just a different syntax for what I have here
(which isn't MERGE). I think it would be counter-productive to pursue
this, though. Also, what about limiting the unique indexes under
consideration?

There was informal meeting of this at the dev meeting a in 2012.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Sep 28, 2014 at 1:31 PM, Peter Geoghegan <pg@heroku.com> wrote:
> There was informal meeting of this at the dev meeting a in 2012.

I mean: There was informal agreement that as long as we're working on
a feature that makes useful, UPSERT-like guarantees, we shouldn't use
the MERGE syntax. MERGE clearly benefits (in ways only relevant to the
use-case it targets) from having the leeway to not care about what
someone with the UPSERT use-case would call race conditions.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Gavin Flower
Date:
On 29/09/14 09:31, Peter Geoghegan wrote:
> On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> MERGE INTO tab USING VALUES ('foo')
>> WHEN NOT MATCHED THEN
>>   INSERT (colB)
>> WHEN MATCHED THEN
>>   UPDATE SET colB = NEW.p1
>>
>> and throwing "ERROR: full syntax for MERGE not implemented yet" if
>> people stretch too far.
> That isn't the MERGE syntax either. Where is the join?
>
> I've extensively discussed why I think we should avoid calling
> something upsert-like MERGE, as you know:
>
http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
>
> We *should* have a MERGE feature, but one that serves the actual MERGE
> use-case well. That is an important use-case; it just isn't the one
> I'm interested in right now.
>
> FWIW, I agree that it wouldn't be much work to do this - what you
> present here really is just a different syntax for what I have here
> (which isn't MERGE). I think it would be counter-productive to pursue
> this, though. Also, what about limiting the unique indexes under
> consideration?
>
> There was informal meeting of this at the dev meeting a in 2012.
>
How about have a stub page for MERGE, saying it is not implemented yet, 
but how about considering UPSERT - or something of that nature?

I can suspect that people are much more likely to look for 'MERGE' in an 
index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'.


Cheers,
Gavin



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Sep 28, 2014 at 3:41 PM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
> How about have a stub page for MERGE, saying it is not implemented yet, but
> how about considering UPSERT - or something of that nature?
>
> I can suspect that people are much more likely to look for 'MERGE' in an
> index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'.

Seems reasonable.

What I have a problem with is using the MERGE syntax to match people's
preexisting confused ideas about what MERGE does. If we do that, it'll
definitely bite us when we go to make what we'd be calling MERGE do
what MERGE is actually supposed to do. I favor clearly explaining
that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Gavin Flower
Date:
On 29/09/14 11:57, Peter Geoghegan wrote:
> On Sun, Sep 28, 2014 at 3:41 PM, Gavin Flower
> <GavinFlower@archidevsys.co.nz> wrote:
>> How about have a stub page for MERGE, saying it is not implemented yet, but
>> how about considering UPSERT - or something of that nature?
>>
>> I can suspect that people are much more likely to look for 'MERGE' in an
>> index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'.
> Seems reasonable.
>
> What I have a problem with is using the MERGE syntax to match people's
> preexisting confused ideas about what MERGE does. If we do that, it'll
> definitely bite us when we go to make what we'd be calling MERGE do
> what MERGE is actually supposed to do. I favor clearly explaining
> that.
>
Opinionated I may be, but I wanted stay well clear of the syntax 
minefield in this area - as I still have at least a vestigial instinct 
for self preservation!  :-)



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
>> What I have a problem with is using the MERGE syntax to match people's
>> preexisting confused ideas about what MERGE does. If we do that, it'll
>> definitely bite us when we go to make what we'd be calling MERGE do
>> what MERGE is actually supposed to do. I favor clearly explaining
>> that.
>>
> Opinionated I may be, but I wanted stay well clear of the syntax minefield
> in this area - as I still have at least a vestigial instinct for self
> preservation!  :-)

To be clear: I don't think Simon is confused about this at all, which
is why I'm surprised that he suggested it.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Gavin Flower
Date:
On 29/09/14 14:20, Peter Geoghegan wrote:
> On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower
> <GavinFlower@archidevsys.co.nz> wrote:
>>> What I have a problem with is using the MERGE syntax to match people's
>>> preexisting confused ideas about what MERGE does. If we do that, it'll
>>> definitely bite us when we go to make what we'd be calling MERGE do
>>> what MERGE is actually supposed to do. I favor clearly explaining
>>> that.
>>>
>> Opinionated I may be, but I wanted stay well clear of the syntax minefield
>> in this area - as I still have at least a vestigial instinct for self
>> preservation!  :-)
> To be clear: I don't think Simon is confused about this at all, which
> is why I'm surprised that he suggested it.
>
>
More specifically, I have only lightly read this thread - and while I 
think the functionality is useful, I have not thought about it any real 
depth.  I was thinking more along the lines that if I needed 
functionality like this, where & how might I look for it.

I was remembering my problems looking up syntax in COBOL after coming 
from FORTRAN (& other languages) - some concepts had different names and 
the philosophy was significantly different in places.  The relevance 
here, is that peoples' background in other DBMS & knowledge of SQL 
standards affect what they expect, as well as preventing unnecessary 
conflicts between PostgreSQL & SQL standards (as far as is practicable & 
sensible).





Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/29/2014 06:41 AM, Gavin Flower wrote:
> 
> I can suspect that people are much more likely to look for 'MERGE' in an
> index, or look for 'MERGE' in the list of SQL commands, than 'UPSERT'.

and/or to be looking for MySQL's:
 ON DUPLICATE KEY {IGNORE|UPDATE}


What astonishes me when I look around at how other RDBMS users solve
this is how many of them completely ignore concurrency issues. e.g. in
this SO question:

http://stackoverflow.com/q/108403/398670

there's an alarming lack of concern for concurrency, just a couple of
links to :

http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/

(BTW, that article contains some useful information about corner cases
any upsert approach should test and deal with).


Similar with Oracle: Alarming lack of concern for concurrency among users:

http://stackoverflow.com/q/237327/398670

Useful article:

http://michaeljswart.com/2011/09/mythbusting-concurrent-updateinsert-solutions/

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Sep 28, 2014 at 8:53 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> there's an alarming lack of concern for concurrency, just a couple of
> links to :
>
> http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/
>
> (BTW, that article contains some useful information about corner cases
> any upsert approach should test and deal with).

Did you find some of those links from my pgCon slides, or
independently? I'm well aware of those issues, FWIW. Avoiding
repeating the mistakes of others is something that I thought about
from an early stage.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/29/2014 12:03 PM, Peter Geoghegan wrote:
> On Sun, Sep 28, 2014 at 8:53 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> there's an alarming lack of concern for concurrency, just a couple of
>> links to :
>>
>> http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/
>>
>> (BTW, that article contains some useful information about corner cases
>> any upsert approach should test and deal with).
> 
> Did you find some of those links from my pgCon slides, or
> independently? I'm well aware of those issues, FWIW. Avoiding
> repeating the mistakes of others is something that I thought about
> from an early stage.

Independently. I'm very glad to see you've looked over those issues.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 09/28/2014 11:31 PM, Peter Geoghegan wrote:
> On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> MERGE INTO tab USING VALUES ('foo')
>> WHEN NOT MATCHED THEN
>>   INSERT (colB)
>> WHEN MATCHED THEN
>>   UPDATE SET colB = NEW.p1
>>
>> and throwing "ERROR: full syntax for MERGE not implemented yet" if
>> people stretch too far.
>
> That isn't the MERGE syntax either. Where is the join?
>
> I've extensively discussed why I think we should avoid calling
> something upsert-like MERGE, as you know:
>
http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
>
> We *should* have a MERGE feature, but one that serves the actual MERGE
> use-case well. That is an important use-case; it just isn't the one
> I'm interested in right now.

I agree we should not use the MERGE keyword for this. The upsert feature 
has tighter concurrency requirements than the SQL MERGE command, and 
that might come back to bite us. It would be highly confusing if some 
variants of MERGE are concurrency-safe and others are not, but if we now 
promise that our MERGE command is always concurrency-safe, that promise 
might be difficult to keep for the full MERGE syntax, and for whatever 
extensions the SQL committee comes up in the future.

That said, it would be handy if the syntax was closer to MERGE. Aside 
from the concurrency issues, it does the same thing, right? So how about 
making the syntax identical to MERGE, except for swapping the MERGE 
keyword with e.g. UPSERT?

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote:
> That said, it would be handy if the syntax was closer to MERGE. Aside from
> the concurrency issues, it does the same thing, right? So how about making
> the syntax identical to MERGE, except for swapping the MERGE keyword with
> e.g. UPSERT?

I don't think that's a good idea. What most people are missing is an
*easy* way to do upsert, that's similar to the normal INSERT. Not
something with a pretty different syntax. That's why INSERT OR REPLACE
and stuff like that was well adopted.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 28 September 2014 21:31, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Sep 28, 2014 at 1:17 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> MERGE INTO tab USING VALUES ('foo')
>> WHEN NOT MATCHED THEN
>>  INSERT (colB)
>> WHEN MATCHED THEN
>>  UPDATE SET colB = NEW.p1
>>
>> and throwing "ERROR: full syntax for MERGE not implemented yet" if
>> people stretch too far.

> I've extensively discussed why I think we should avoid calling
> something upsert-like MERGE, as you know:
>
http://www.postgresql.org/message-id/flat/CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com#CAM3SWZRP0c3g6+aJ=YYDGYAcTZg0xA8-1_FCVo5Xm7hrEL34kw@mail.gmail.com
>
> We *should* have a MERGE feature, but one that serves the actual MERGE
> use-case well. That is an important use-case; it just isn't the one
> I'm interested in right now.
>
> FWIW, I agree that it wouldn't be much work to do this - what you
> present here really is just a different syntax for what I have here
> (which isn't MERGE). I think it would be counter-productive to pursue
> this, though. Also, what about limiting the unique indexes under
> consideration?
>
> There was informal meeting of this at the dev meeting a in 2012.

I agreed with the initial proposition to go for a different syntax.

Now that I see the new syntax, I have changed my mind. The new syntax
is much worse, I am sorry to say.

MERGE standard does not offer guidance on concurrent effects, but
there is no confusion as to how it works. We can impose our own
concurrency rules since those are not covered by the standard. These
are quite clear for single row inputs anyway, i.e. a VALUES clause.

>
> That isn't the MERGE syntax either. Where is the join?
>

There doesn't need to be one. INSERT assumes that if a column list is
not mentioned then the VALUES clause is joined directly to the table,
so we can do the same thing for MERGE.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 29 September 2014 02:20, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Sep 28, 2014 at 6:15 PM, Gavin Flower
> <GavinFlower@archidevsys.co.nz> wrote:
>>> What I have a problem with is using the MERGE syntax to match people's
>>> preexisting confused ideas about what MERGE does. If we do that, it'll
>>> definitely bite us when we go to make what we'd be calling MERGE do
>>> what MERGE is actually supposed to do. I favor clearly explaining
>>> that.
>>>
>> Opinionated I may be, but I wanted stay well clear of the syntax minefield
>> in this area - as I still have at least a vestigial instinct for self
>> preservation!  :-)
>
> To be clear: I don't think Simon is confused about this at all, which
> is why I'm surprised that he suggested it.

At this point, I started to discuss MERGE again, but let me stop
because there is a wider issue.

These threads are littered with references that go nowhere. Links back
to an email where you said the same thing two years ago are not proof
that its a bad idea. You need to carefully explain things in detail in
one place to allow people to make up their own minds, not just
re-assert it endlessly and claim 3 friends also agree, while everyone
else searches desperately for what the actual reasons are. Lists of
problems with MERGE statement, with examples are what is needed to
convince and keep us convinced. Then full documentation on the
proposed solution, so we can see that also.

Please go to some trouble to tidy things up so we have clarity that
*we* can see and decide for ourselves whether or not you are correct.

Thanks

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 29 September 2014 08:02, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote:
>> That said, it would be handy if the syntax was closer to MERGE. Aside from
>> the concurrency issues, it does the same thing, right? So how about making
>> the syntax identical to MERGE, except for swapping the MERGE keyword with
>> e.g. UPSERT?
>
> I don't think that's a good idea. What most people are missing is an
> *easy* way to do upsert, that's similar to the normal INSERT. Not
> something with a pretty different syntax. That's why INSERT OR REPLACE
> and stuff like that was well adopted.

We have 3 choices...

1. SQL Standard MERGE (or a subset)
2. MySQL Compatible syntax
3. Something completely different

If we go for (3), I would like to see a long and detailed explanation
of what is wrong with (1) and (2) before we do (3). That needs to be
clear, detailed, well researched, correct and agreed. Otherwise when
we release such a feature, people will ask, why did you do that? And
yet nobody will remember.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/29/2014 05:10 PM, Simon Riggs wrote:
> 
> Please go to some trouble to tidy things up so we have clarity that
> *we* can see and decide for ourselves whether or not you are correct.

Are you suggesting a wiki page to document the issues, discussions
around each issue, etc? A summary mail? Something else?

We have https://wiki.postgresql.org/wiki/SQL_MERGE but it's outdated,
pretty sparse, and not really about the current work.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 29 September 2014 10:27, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 09/29/2014 05:10 PM, Simon Riggs wrote:
>>
>> Please go to some trouble to tidy things up so we have clarity that
>> *we* can see and decide for ourselves whether or not you are correct.
>
> Are you suggesting a wiki page to document the issues, discussions
> around each issue, etc? A summary mail? Something else?

Something that can be edited to keep it up to date, yes.

> We have https://wiki.postgresql.org/wiki/SQL_MERGE but it's outdated,
> pretty sparse, and not really about the current work.

I rest my case.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 28 September 2014 08:40, Peter Geoghegan <pg@heroku.com> wrote:
> On Sat, Sep 27, 2014 at 11:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> My request was for the following...
>>
>> Agree command semantics by producing these things
>> * Explanatory documentation (Ch6.4 Data Manipulation - Upsert)
>
> Do you really think I could get an entire chapter out of this?

If you were an ORM developer reading the PostgreSQL Release Notes for
9.5, which URL would you visit to see a complete description of the
new feature, including how it works concurrently, locking and other
aspects. How would you check whether some strange behaviour was a bug,
or intentional?

The new docs are scattered across many pages and there are very few
examples. It was very difficult to read like that.


>> * SQL Reference Documentation (INSERT)
>> * Test cases for feature
>> * Test cases for concurrency
>
> All of these were added. There are two new sets of isolation tests,
> one per variant of the new clause (IGNORE/UPDATE).

When you say "added", what do you mean? You posted one new doc patch,
with no tests in it.

>> Question arising: do you need to specify location criteria, or is this
>> an additional filter? When/why would we want that?
>
> It is an additional way to specify a predicate/condition to UPDATE on.
> There might be a kind of redundancy, if you decided to repeat the
> constrained values in the predicate too, but if you're using the WHERE
> clause sensibly there shouldn't be. So your UPDATE's "full predicate"
> is sort of the union of the constrained values that the conflict path
> was taken for, plus whatever you put in the WHERE clause, but not
> quite because they're evaluated at different times (as explained
> within transaction-iso.html).

I think we should leave that out of the first commit. I'm not sure why
that exists. If you wish to push down that route, then I recommend
using the MERGE syntax because it caters for this much better than
this.


>> How would you do "if colA = 3 then ignore else update"?
>
> Technically, you can't do that exact thing. IGNORE is just for quickly
> dealing with ETL-type problems (and it is reasonable to use it without
> one particular unique index in mind, unlike ON CONFLICT UPDATE) -
> think pgloader. But if you did this:
>
> INSERT INTO tab(colB) values('foo') ON CONFLICT UPDATE set colB =
> CONFLICTING(colB) WHERE colA != 3
>
> Then you would achieve almost the same thing. You wouldn't have
> inserted or updated anything if the only rows considered had a colA of
> 3, but any such rows considered would be locked, which isn't the same
> as IGNOREing them.
>
>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>> syntax used in triggers
>
> Why should it be the same?

Because it would be a principled approach to do that.

If we aren't going to use MERGE syntax, it would make sense to at
least use the same terminology.

e.g.
INSERT ....
WHEN MATCHED
UPDATE

The concept of "matched" is identical between MERGE and UPSERT and it
will be confusing to have two words for the same thing.

There seems to be a good reason not to use the MySQL syntax of ON
DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE
operations other than a replace, so no deltas, e.g. SET a = a + x

Having said that, it would be much nicer to have a mode that allows
you to just say the word "UPDATE" and have it copy the data into the
correct columns, like MySQL does. That is very intuitive, even if it
isn't very flexible.

>> The page makes no mention of the upsert problem, nor is any previous
>> code mentioned.
>
> What's the upsert problem? I mean, apart from the fact that we don't
> have it. Note that it is documented that one of the two outcomes is
> guaranteed.
>
> I should have updated the plpgsql looping subxact example, though.

That's what I meant.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Sep 26, 2014 at 5:40 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I will be frank. Everyone knows that the nbtree locking parts of this
> are never going to be committed over your objections. It cannot
> happen. And yet, I persist in proposing that we go that way. I may be
> stubborn, but I am not so stubborn that I'd jeopardize all the work
> I've put into this to save one aspect of it that no one really cares
> about anyway (even I only care about meeting my goals for user visible
> behavior [2]). I may actually come up with a better way to make what
> you outline work; then again, I may not. I have no idea, to be honest.
> It's pretty clear that I'm going to have a hard time getting your
> basic approach to value locking accepted without rethinking it a lot,
> though. Can you really say that you won't have serious misgivings
> about something like the "tuple->xmin = InvalidTransactionId"
> swapping, if I actually formally propose it? That's very invasive to a
> lot of places. And right now, I have no idea how we could do better.
>
> I really only want to get to where we have a design that's acceptable.
> In all sincerity, I may yet be convinced to go your way. It's possible
> that I've failed to fully understand your concerns. Is it really just
> about making INSERT ... ON CONFLICT IGNORE work with exclusion
> constraints (UPDATE clearly makes little sense)?

I'll be frank, too.  Heikki doesn't need to persuade you to go his
way, because everyone other than yourself who has looked at this
problem has come up with a design that looks like his.  That includes,
but is not limited to, every committer who has looked at this.  The
burden of proof is on you to convince everyone else that the promise
tuple approach is wrong, not on everyone else to convince you that
it's right.  This is a community, and it operates by consensus.  Your
opinion, no matter how strongly held in the face of opposition, is not
a consensus.

As far as finding an option that's better than clearing the xmin, the
point is not that we'd commit that design.  Well, we might, if
somebody does a careful audit of all the relevant code paths and makes
a convincing argument that it's safe.  But more likely, somebody will
go find some other bit space that can be used to do this.  The fact
that it's not immediately obvious to you (or Heikki) where to find
that bit-space is not a principled argument for changing the whole
design.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 8:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'll be frank, too.  Heikki doesn't need to persuade you to go his
> way, because everyone other than yourself who has looked at this
> problem has come up with a design that looks like his.

Andres suggested something that is very roughly comparable, perhaps.
And that was it, really, except for your suggestion that I convinced
you wasn't the best way forward (for unrelated reasons).

> As far as finding an option that's better than clearing the xmin, the
> point is not that we'd commit that design.  Well, we might, if
> somebody does a careful audit of all the relevant code paths and makes
> a convincing argument that it's safe.  But more likely, somebody will
> go find some other bit space that can be used to do this.  The fact
> that it's not immediately obvious to you (or Heikki) where to find
> that bit-space is not a principled argument for changing the whole
> design.

I never said that it was.

*Obviously* I know that Heikki is not obligated to convince me of
anything - I said as much. Whether or not Heikki is obligated to
convince me is not the point, which is that it would be nice if he
could convince me. I think that there are some serious issues with the
promise tuples approach, and discussing those brings us closer to
moving forward.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 10:37 AM, Peter Geoghegan <pg@heroku.com> wrote:
> But more likely, somebody will
>> go find some other bit space that can be used to do this.

My concerns have nothing to do with the availability of bit space, obviously.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> If you were an ORM developer reading the PostgreSQL Release Notes for
> 9.5, which URL would you visit to see a complete description of the
> new feature, including how it works concurrently, locking and other
> aspects. How would you check whether some strange behaviour was a bug,
> or intentional?

We don't do that with UPDATE, so why would we do it with this? There
is an existing structure to the documentation that needs to be
respected. This is the case even though the EvalPlanQual() mechanism
is a total Postgres-ism, which can potentially violate snapshot
isolation (this is not true of Oracle's READ COMMITTED, for example).
You have to go out of your way to find that out at the moment. But I
know ORM authors, and the majority probably don't understand this
stuff - that ought to be okay.

>> All of these were added. There are two new sets of isolation tests,
>> one per variant of the new clause (IGNORE/UPDATE).
>
> When you say "added", what do you mean? You posted one new doc patch,
> with no tests in it.

I mean that there was a commit (not included with the documentation,
but with the original patchset) with many tests. I don't know why
you're suggesting that I don't have "concurrency tests". There are
isolation tests in that commit. There are also many regression tests.

>> It is an additional way to specify a predicate/condition to UPDATE on.
>> There might be a kind of redundancy, if you decided to repeat the
>> constrained values in the predicate too, but if you're using the WHERE
>> clause sensibly there shouldn't be. So your UPDATE's "full predicate"
>> is sort of the union of the constrained values that the conflict path
>> was taken for, plus whatever you put in the WHERE clause, but not
>> quite because they're evaluated at different times (as explained
>> within transaction-iso.html).
>
> I think we should leave that out of the first commit. I'm not sure why
> that exists. If you wish to push down that route, then I recommend
> using the MERGE syntax because it caters for this much better than
> this.

Why leave it out? People are going to "push the predicate into the
targetlist" if I do, and the effect is exactly the same.

>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>>> syntax used in triggers
>>
>> Why should it be the same?
>
> Because it would be a principled approach to do that.

That is just an assertion. The MERGE syntax doesn't use that either.

> If we aren't going to use MERGE syntax, it would make sense to at
> least use the same terminology.
>
> e.g.
> INSERT ....
> WHEN MATCHED
> UPDATE
>
> The concept of "matched" is identical between MERGE and UPSERT and it
> will be confusing to have two words for the same thing.

I don't care if we change the spelling to "WHEN MATCHED
UPDATE/IGNORE". That seems fine. But MERGE is talking about a join,
not the presence of a would-be duplicate violation.

> There seems to be a good reason not to use the MySQL syntax of ON
> DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE
> operations other than a replace, so no deltas, e.g. SET a = a + x

That isn't true, actually. It clearly does.

> Having said that, it would be much nicer to have a mode that allows
> you to just say the word "UPDATE" and have it copy the data into the
> correct columns, like MySQL does. That is very intuitive, even if it
> isn't very flexible.

Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 12:02 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-09-29 09:51:45 +0300, Heikki Linnakangas wrote:
>> That said, it would be handy if the syntax was closer to MERGE. Aside from
>> the concurrency issues, it does the same thing, right? So how about making
>> the syntax identical to MERGE, except for swapping the MERGE keyword with
>> e.g. UPSERT?
>
> I don't think that's a good idea. What most people are missing is an
> *easy* way to do upsert, that's similar to the normal INSERT. Not
> something with a pretty different syntax. That's why INSERT OR REPLACE
> and stuff like that was well adopted.

Agreed.

MERGE isn't the same other than the concurrency concerns, in any case.
It is driven by a join, which is very flexible, but also has problems
with concurrency (leaving aside the fact that in practice it doesn't
tend to work out well when it isn't an equi-join). UPSERT *has* to be
driven by something like a would-be unique violation, not an outer
join matching or not matching.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 2:27 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> Please go to some trouble to tidy things up so we have clarity that
>> *we* can see and decide for ourselves whether or not you are correct.
>
> Are you suggesting a wiki page to document the issues, discussions
> around each issue, etc? A summary mail? Something else?

It isn't easy, Simon. I thought my big e-mail at the start of the
thread was a summary.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 2:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> 1. SQL Standard MERGE (or a subset)
> 2. MySQL Compatible syntax
> 3. Something completely different
>
> If we go for (3), I would like to see a long and detailed explanation
> of what is wrong with (1) and (2) before we do (3). That needs to be
> clear, detailed, well researched, correct and agreed. Otherwise when
> we release such a feature, people will ask, why did you do that? And
> yet nobody will remember.

My syntax is inspired by the MySQL one, with some influence from
SQLite (SQLite have an ON CONFLICT REPLACE). I don't want to copy
MySQL's use of VALUES() in the UPDATE targetlist - I spell the same
concept as CONFLICTING(). I guess that otherwise they'd have to make
the VALUES()/CONFLICTING() expression a whole new fully reserved
keyword, and preferred not to. Also, MySQL bizarrely omits the "SET"
keyword within ON DUPLICATE KEY UPDATE. So I haven't copied it exactly
on aesthetic grounds. I think that the actual reason for the latter
wart (the SET omission) is that MySQL found it easier to write the
grammar that way. Consider what we do here to make SET in an UPDATE
work, despite the fact that it's a valid column name:

https://github.com/postgres/postgres/blob/REL9_4_STABLE/src/backend/parser/gram.y#L10141

So I wanted to suggest something similar but not identical to the
MySQL syntax, with a bit more flexibility/safety. I thought that I
could do so without emulating their warts.

As I've mentioned, it isn't the MERGE syntax because that is quite a
different thing. There is a place for it, but it's not strategically
important in the same way as upsert is.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Peter Geoghegan <pg@heroku.com> wrote:

> As I've mentioned, it isn't the MERGE syntax because that is
> quite a different thing. There is a place for it, but it's not
> strategically important in the same way as upsert is.

I think that the subset of the MERGE syntax that would be needed
for UPSERT behavior would be as follows.  For one row as literals:
 MERGE INTO tab t   USING (VALUES ('foo', 'p1')) new(id, colB)   ON (t.id = new.id)   WHEN MATCHED THEN     UPDATE SET
colB= new.colB   WHEN NOT MATCHED THEN     INSERT (id, colB) VALUES (new.id, new.colB);
 

If you have a bunch of rows in a "bar" table you want to merge in:
 MERGE INTO tab t   USING (SELECT id, colB FROM bar) b   ON (t.id = b.id)   WHEN MATCHED THEN     UPDATE SET colB =
b.colB  WHEN NOT MATCHED THEN     INSERT (id, colB) VALUES (b.id, b.colB);
 

I fail to see how this is harder or more problematic than the
nonstandard suggestions that have been floated.  I don't know why
we would be even *considering* a nonstandard syntax rather than
saying that only this subset is supported *so far*.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 1:40 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> I think that the subset of the MERGE syntax that would be needed
> for UPSERT behavior would be as follows.  For one row as literals:
>
>   MERGE INTO tab t
>     USING (VALUES ('foo', 'p1')) new(id, colB)
>     ON (t.id = new.id)
>     WHEN MATCHED THEN
>       UPDATE SET colB = new.colB
>     WHEN NOT MATCHED THEN
>       INSERT (id, colB) VALUES (new.id, new.colB);
>
> If you have a bunch of rows in a "bar" table you want to merge in:
>
>   MERGE INTO tab t
>     USING (SELECT id, colB FROM bar) b
>     ON (t.id = b.id)
>     WHEN MATCHED THEN
>       UPDATE SET colB = b.colB
>     WHEN NOT MATCHED THEN
>       INSERT (id, colB) VALUES (b.id, b.colB);
>
> I fail to see how this is harder or more problematic than the
> nonstandard suggestions that have been floated.  I don't know why
> we would be even *considering* a nonstandard syntax rather than
> saying that only this subset is supported *so far*.

Heikki, Andres and I are against using MERGE for this, fwiw. Tom
seemed to think so too, on previous occasions. It isn't a matter of
alternative syntaxes. I have described in detail why I think it's a
bad idea - I have linked to that about 3 times in this thread. It
paints us into a corner when we go to make this do what MERGE is
supposed to do. Do you want a feature that, when fully generalized,
plays a special visibility game based on whether or not some exact set
of conditions are met? That is a non-starter, IMV.

The whole idea of using an arbitrary join syntax seems great, but I
need something that works backwards from would-be unique violations.
That's the only way to preserve the UPSERT guarantees (atomicity,
definite insert or update).

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Peter Geoghegan <pg@heroku.com> wrote:

> Heikki, Andres and I are against using MERGE for this, fwiw. Tom
> seemed to think so too, on previous occasions. It isn't a matter
> of alternative syntaxes. I have described in detail why I think
> it's a bad idea - I have linked to that about 3 times in this
> thread.

Yeah, I read that, and I'm not convinced.

> It paints us into a corner when we go to make this do what MERGE
> is supposed to do. Do you want a feature that, when fully
> generalized, plays a special visibility game based on whether or
> not some exact set of conditions are met? That is a non-starter,
> IMV.

For other queries we use different access techniques not only
based on the presence of an index, but on the state of the
visibility map, degree of bloat, ordering of tuples in a heap,
etc. -- so sure, I'm OK with different execution styles based on
whether your join conditions match a unique index on columns that
can't be NULL.

> The whole idea of using an arbitrary join syntax seems great,
> but I need something that works backwards from would-be unique
> violations. That's the only way to preserve the UPSERT guarantees
> (atomicity, definite insert or update).

I absolutely don't buy that it is the *only way*.  It is probably
(by far) the *easiest* way, and doing so gets us a frequently-
requested feature; but I think limiting the initial implementation
to cases where the join conditions include equality tests on all
columns of some appropriate unique index is fine, and doesn't seem
to me to preclude further development of the MERGE feature for
additional cases.  In fact, I think having something to build on is
a plus.

The claims that you can't get a duplicate key error with an UPSERT
are completely bogus, IMV.  The *best* you can do is avoid them on
the index used for matching (unless you're willing to ignore
problem input rows or mangle the data in surprising ways to avoid
such an error on a second unique index).  With a fully functional
MERGE syntax you could eventually gain the ability to write
exceptions like that to the location of your choice (be it a table
or WARNING messages in the log).

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 2:20 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> The claims that you can't get a duplicate key error with an UPSERT
> are completely bogus, IMV.  The *best* you can do is avoid them on
> the index used for matching (unless you're willing to ignore
> problem input rows or mangle the data in surprising ways to avoid
> such an error on a second unique index).

That's what I meant. Doing any more than that isn't useful. I want to
do exactly that - no more, no less.

If you're still not convinced, then I think the fact that no MERGE
implementation does what you want should be convincing. It is
*horrifically* complicated to make what you want work, if indeed it is
technically feasible at all. Isn't this already complicated enough?

We use different access techniques as you say. We do not use different
types of snapshots. That seems like a pretty fundamental distinction.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Peter Geoghegan <pg@heroku.com> wrote:

> I think the fact that no MERGE implementation does what you want
> should be convincing. It is *horrifically* complicated to make
> what you want work, if indeed it is technically feasible at all.
> Isn't this already complicated enough?

What about the MERGE syntax I posted makes it hard to implement the
statement validation and execution code you already have?  (I'm
asking about for the UPSERT case only, not an implementation of all
aspects of the standard syntax.)

To recap, in summary that would be:
 MERGE INTO tablename [ alias ]   USING ( relation ) [ alias ]   ON ( boolean-expression )   WHEN MATCHED THEN
UPDATESET target-column = expression            [ , target-column = expression ] ...   WHEN NOT MATCHED THEN     INSERT
(target-columns ) VALUES ( expressions )
 

The initial implementation could restrict to these exact clauses
and require that the boolean-expression used equality-quals on all
columns of a unique index on only NOT NULL columns.  I think the
relation could be a VALUES clause or any SELECT statement without
causing problems; do you think that would need to be constrained in
some way?  It would be wonderful if the expressions could be any
arbitrary expressions assignable to the target columns; do you see
a need to constrain that?

If we later expand the MERGE statement to more general cases, I
don't see why statements of this form could not be treated as a
special case.  Personally, I'm dubious that we would want to
compromise transactional integrity to achieve the broader case, but
doubt that we would need to do so.  I won't say it is just a SMOP,
because there would need to be some careful design first.  ;-)

> We use different access techniques as you say. We do not use
> different types of snapshots. That seems like a pretty
> fundamental distinction.

We use special types of snapshots in running DML that fires certain
types of constraints, like FKs.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote:
> Peter Geoghegan <pg@heroku.com> wrote:
> 
> > I think the fact that no MERGE implementation does what you want
> > should be convincing. It is *horrifically* complicated to make
> > what you want work, if indeed it is technically feasible at all.
> > Isn't this already complicated enough?
> 
> What about the MERGE syntax I posted makes it hard to implement the
> statement validation and execution code you already have?  (I'm
> asking about for the UPSERT case only, not an implementation of all
> aspects of the standard syntax.)

> To recap, in summary that would be:
> 
>   MERGE INTO tablename [ alias ]
>     USING ( relation ) [ alias ]
>     ON ( boolean-expression )
>     WHEN MATCHED THEN
>       UPDATE SET target-column = expression
>              [ , target-column = expression ] ...
>     WHEN NOT MATCHED THEN
>       INSERT ( target-columns ) VALUES ( expressions )
> 
> The initial implementation could restrict to these exact clauses
> and require that the boolean-expression used equality-quals on all
> columns of a unique index on only NOT NULL columns.

That'll make it really hard to actually implement real MERGE.

Because suddenly there's no way for the user to know whether he's
written a ON condition that can implement UPSERT like properties
(i.e. the *precise* column list of an index) or not.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 3:02 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> That'll make it really hard to actually implement real MERGE.
>
> Because suddenly there's no way for the user to know whether he's
> written a ON condition that can implement UPSERT like properties
> (i.e. the *precise* column list of an index) or not.

Exactly. The difficulty isn't doing what Kevin says so much as doing
so and then at a later date taking that thing and making it into a
fully featured MERGE. We'll be painted into a corner. That's bad,
because as I've said I think we need MERGE too (just far less
urgently).


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote:

>> The initial implementation could restrict to these exact clauses
>> and require that the boolean-expression used equality-quals on all
>> columns of a unique index on only NOT NULL columns.
>
> That'll make it really hard to actually implement real MERGE.
>
> Because suddenly there's no way for the user to know whether he's
> written a ON condition that can implement UPSERT like properties
> (i.e. the *precise* column list of an index) or not.

Well, unless we abandon transactional semantics for other MERGE
statements, we should have a way that UPSERT logic continues to
work if you don't match a suitable index; it will just be slower --
potentially a lot slower, but that's what indexes are for.  I don't
think we need a separate statement type for the one we "do well",
because I don't think we should do the other one without proper
transactional semantics.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-29 15:08:36 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-09-29 14:57:45 -0700, Kevin Grittner wrote:
> 
> >> The initial implementation could restrict to these exact clauses
> >> and require that the boolean-expression used equality-quals on all
> >> columns of a unique index on only NOT NULL columns.
> >
> > That'll make it really hard to actually implement real MERGE.
> >
> > Because suddenly there's no way for the user to know whether he's
> > written a ON condition that can implement UPSERT like properties
> > (i.e. the *precise* column list of an index) or not.
> 
> Well, unless we abandon transactional semantics for other MERGE
> statements, we should have a way that UPSERT logic continues to
> work if you don't match a suitable index; it will just be slower --
> potentially a lot slower, but that's what indexes are for.  I don't
> think we need a separate statement type for the one we "do well",
> because I don't think we should do the other one without proper
> transactional semantics.

Wrong. You can't realistically implement the guarantees of UPSERT
without a corresponding UNIQUE index.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Andres Freund <andres@2ndquadrant.com> wrote:

> Wrong. You can't realistically implement the guarantees of UPSERT
> without a corresponding UNIQUE index.

You definitely can do it; the question is what you consider
reasonable in terms of development effort, performance, and
concurrency.  I think the problem can be solved with non-scary
values of pretty much any two of those.  I guess my assumption is
that we won't handle the general case until someone wants to put
the substantial development effort into making the other two
acceptable.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-29 15:16:49 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> 
> > Wrong. You can't realistically implement the guarantees of UPSERT
> > without a corresponding UNIQUE index.
> 
> You definitely can do it; the question is what you consider
> reasonable in terms of development effort, performance, and
> concurrency.

Right. You can exclusively lock the table and such. The point is just
that nobody wants that. I.e. people want to be warned about it.

> I think the problem can be solved with non-scary values of pretty much
> any two of those.  I guess my assumption is that we won't handle the
> general case until someone wants to put the substantial development
> effort into making the other two acceptable.

Which would be a major loss because MERGE is rather useful outside of
atomic upsert.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 3:08 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Well, unless we abandon transactional semantics for other MERGE
> statements, we should have a way that UPSERT logic continues to
> work if you don't match a suitable index; it will just be slower --
> potentially a lot slower, but that's what indexes are for.

I want an implementation that doesn't have unique violations,
unprincipled deadlocks, or serialization failures at READ COMMITTED. I
want it because that's what  the majority of users actually want. It
requires no theoretical justification.

> I don't
> think we need a separate statement type for the one we "do well",
> because I don't think we should do the other one without proper
> transactional semantics.

That seems like a very impractical attitude. I cannot simulate what
I've been doing with unique indexes without taking an exclusive table
lock. That is a major footgun, so it isn't going to happen.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/30/2014 01:59 AM, Peter Geoghegan wrote:
> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If you were an ORM developer reading the PostgreSQL Release Notes for
>> 9.5, which URL would you visit to see a complete description of the
>> new feature, including how it works concurrently, locking and other
>> aspects. How would you check whether some strange behaviour was a bug,
>> or intentional?
> 
> We don't do that with UPDATE, so why would we do it with this? There
> is an existing structure to the documentation that needs to be
> respected. 

I tend to agree, so long as there are appropriate cross-references.

See, for example, how window function information was added.

>This is the case even though the EvalPlanQual() mechanism
> is a total Postgres-ism, which can potentially violate snapshot
> isolation (this is not true of Oracle's READ COMMITTED, for example).

That's useful to know, and certainly worth covering in the isolation
portion of the docs.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/30/2014 05:57 AM, Kevin Grittner wrote:
> Peter Geoghegan <pg@heroku.com> wrote:
> 
>> I think the fact that no MERGE implementation does what you want
>> should be convincing. It is *horrifically* complicated to make
>> what you want work, if indeed it is technically feasible at all.
>> Isn't this already complicated enough?
> 
> What about the MERGE syntax I posted makes it hard to implement the
> statement validation and execution code you already have?  (I'm
> asking about for the UPSERT case only, not an implementation of all
> aspects of the standard syntax.)

As I understand it, it isn't the syntax that's hard, it's the logic
behind it.

FWIW I'm pretty persuaded by the argument that:

* Other RDBMSes's MERGE implementations don't behave this way;

* MERGE is a join-based operation, it's not really the same as an upsert (though a join on a values-list is
similar-ish);

* Making MERGE work for the concurrency-safe upsert case would render it harder to then support the rest of MERGE for
theOLAP/data merging cases it's really specified for.
 

I also have a serious usability concern about re-purposing MERGE for
this. I think it'll be confusing to have a MERGE that's usable as a
concurrency-safe upsert and also as a non-concurrency-safe data merging
operation with slightly different options.

Borrowing from / closely following the MERGE syntax likely makes sense,
but special-casing a subset of MERGE would IMO be a regrettable
long-term decision.

> If we later expand the MERGE statement to more general cases, I
> don't see why statements of this form could not be treated as a
> special case.

Please, no.

That's basically having two different kinds of statement with subtly
different syntax differentiating them.

Upsert is full of confusing and subtle behaviour. Any implementation
needs to focus on making it easy to get right, and I don't think having
something where small syntax variations can cause you to silently trip
out of the concurrency-safe mode of operation would meet that need.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 09/30/2014 06:08 AM, Kevin Grittner wrote:
> Well, unless we abandon transactional semantics for other MERGE
> statements, we should have a way that UPSERT logic continues to
> work if you don't match a suitable index; it will just be slower --
> potentially a lot slower, but that's what indexes are for.

That would probably lead to MERGE taking different lock strengths based
on index availability, having different failure modes, etc.

The less internal magic inside what's already a complicated and
confusing area for users, the better.

> I don't
> think we need a separate statement type for the one we "do well",
> because I don't think we should do the other one without proper
> transactional semantics.

"Proper transactional semantics" isn't the same as "free from all forms
of race condition".

Sometimes you want or need to do things that can't be made
concurrency-safe, or would perform unacceptably if done in a
concurrency-safe manner. That's why we have LOCK TABLE, among other things.

We have READ COMMITTED for a reason. We have SELECT without FOR SHARE
for a reason.

MERGE seems to be specified as more of an OLAP / ETL operation than an
OLTP one, and I think we should probably respect that - and the way
other RDBMSes have already implemented it.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Sep 29, 2014 at 3:08 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Well, unless we abandon transactional semantics for other MERGE
>> statements, we should have a way that UPSERT logic continues to
>> work if you don't match a suitable index; it will just be slower --
>> potentially a lot slower, but that's what indexes are for.
>
> I want an implementation that doesn't have unique violations,
> unprincipled deadlocks, or serialization failures at READ COMMITTED. I
> want it because that's what  the majority of users actually want. It
> requires no theoretical justification.

Sure.  I'm not suggesting otherwise.

>> I don't think we need a separate statement type for the one we
>> "do well", because I don't think we should do the other one
>> without proper transactional semantics.
>
> That seems like a very impractical attitude. I cannot simulate what
> I've been doing with unique indexes without taking an exclusive table
> lock. That is a major footgun, so it isn't going to happen.

There are certainly other ways to do it, although they require more
work.  As far as UPSERT goes, I agree that we should require such
an index, at least for the initial implementation and into the
foreseeable future.  What I'm saying is that if we implement it
using the standard MERGE syntax, then if the features of MERGE are
extended it will continue to work even in the absence of such an
index.  The index becomes a way of optimizing access rather than
defining what access is allowed.

At the risk of pushing people away from this POV, I'll point out
that this is somewhat similar to what we do for unlogged bulk loads
-- if all the conditions for doing it the fast way are present, we
do it the fast way; otherwise it still works, but slower.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 29 September 2014 18:59, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If you were an ORM developer reading the PostgreSQL Release Notes for
>> 9.5, which URL would you visit to see a complete description of the
>> new feature, including how it works concurrently, locking and other
>> aspects. How would you check whether some strange behaviour was a bug,
>> or intentional?
>
> We don't do that with UPDATE, so why would we do it with this?

Because this is new, harder and non-standard, so there is no other
place to look. If you want to persuade us that MERGE has poorly
defined concurrency, so you have implemented a new command, the new
command had better have very well defined behaviour.

And because a reviewer asked for it?

For example, this patch for UPSERT doesn't support updatable views.
But I can't see anyone that didn't read the patch would know that.


>>> All of these were added. There are two new sets of isolation tests,
>>> one per variant of the new clause (IGNORE/UPDATE).
>>
>> When you say "added", what do you mean? You posted one new doc patch,
>> with no tests in it.
>
> I mean that there was a commit (not included with the documentation,
> but with the original patchset) with many tests. I don't know why
> you're suggesting that I don't have "concurrency tests". There are
> isolation tests in that commit. There are also many regression tests.

I see the tests in earlier patches; I was observing there are no new ones.

There are no tests for the use of CONFLICTING() syntax
No tests for interaction with triggers, with regard to before triggers
changing values prior to conflict detection.

My hope was that the complex behaviour of multiple unique indexes
might be explained there. Forgive me, I didn't see it.



>>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>>>> syntax used in triggers
>>>
>>> Why should it be the same?
>>
>> Because it would be a principled approach to do that.
>
> That is just an assertion. The MERGE syntax doesn't use that either.

MERGE allows "AS row" which then allow you to refer to row.x for
column x of the input.

Other people have independently commented the same thing.


>> If we aren't going to use MERGE syntax, it would make sense to at
>> least use the same terminology.
>>
>> e.g.
>> INSERT ....
>> WHEN MATCHED
>> UPDATE
>>
>> The concept of "matched" is identical between MERGE and UPSERT and it
>> will be confusing to have two words for the same thing.
>
> I don't care if we change the spelling to "WHEN MATCHED
> UPDATE/IGNORE". That seems fine. But MERGE is talking about a join,
> not the presence of a would-be duplicate violation.

I don't understand that comment.

>> There seems to be a good reason not to use the MySQL syntax of ON
>> DUPLICATE KEY UPDATE, which doesn't allow you to specify UPDATE
>> operations other than a replace, so no deltas, e.g. SET a = a + x
>
> That isn't true, actually. It clearly does.

It does. Rather amusingly I misread the very unclear MySQL docs.


>> Having said that, it would be much nicer to have a mode that allows
>> you to just say the word "UPDATE" and have it copy the data into the
>> correct columns, like MySQL does. That is very intuitive, even if it
>> isn't very flexible.
>
> Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW.

If I want the incoming row to overwrite the old row, it would be good
to have syntax to support that easily.

Why doesn't
INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b')  ON
CONFLICT UPDATE SET t = 'fails';
end up with this in the table?

1   a
2   fails

What happens with this?

BEGIN;
INSERT INTO UNIQUE_TBL VALUES (2, 'b')  ON CONFLICT UPDATE SET t = 'fails';
INSERT INTO UNIQUE_TBL VALUES (2, 'b')  ON CONFLICT UPDATE SET t = 'fails';
COMMIT;

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 8:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>>> No explanation of why the CONFLICTING() syntax differs from OLD./NEW.
>>>>> syntax used in triggers
>>>>
>>>> Why should it be the same?
>>>
>>> Because it would be a principled approach to do that.
>>
>> That is just an assertion. The MERGE syntax doesn't use that either.
>
> MERGE allows "AS row" which then allow you to refer to row.x for
> column x of the input.

It does, but that isn't what you suggested. You talked about the
OLD.*/NEW.* syntax.

>> I don't care if we change the spelling to "WHEN MATCHED
>> UPDATE/IGNORE". That seems fine. But MERGE is talking about a join,
>> not the presence of a would-be duplicate violation.
>
> I don't understand that comment.

I just mean that if you want to replace ON CONFLICT UPDATE with WHEN
MATCHED UPDATE - that little part of the grammar - that seems okay.

>> Multi-assignment updates (with or without CONFLICTING()) are supported, FWIW.
>
> If I want the incoming row to overwrite the old row, it would be good
> to have syntax to support that easily.

Well, maybe I'll get around to that when things settle down. That's
clearly in the realm of "nice to have", though.

> Why doesn't
> INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b')  ON
> CONFLICT UPDATE SET t = 'fails';
> end up with this in the table?
>
> 1   a
> 2   fails

A "cardinality violation" - just like MERGE. As with MERGE, the final
value of every row needs to be deterministic (within the command).

> What happens with this?
>
> BEGIN;
> INSERT INTO UNIQUE_TBL VALUES (2, 'b')  ON CONFLICT UPDATE SET t = 'fails';
> INSERT INTO UNIQUE_TBL VALUES (2, 'b')  ON CONFLICT UPDATE SET t = 'fails';
> COMMIT;

It works fine. No cardinality violation with two separate commands.
See the new ExecLockUpdateTuple() function within nodeModifyTable.c
for extensive discussion on how this is handled.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 8:30 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 29 September 2014 18:59, Peter Geoghegan <pg@heroku.com> wrote:
>> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> If you were an ORM developer reading the PostgreSQL Release Notes for
>>> 9.5, which URL would you visit to see a complete description of the
>>> new feature, including how it works concurrently, locking and other
>>> aspects. How would you check whether some strange behaviour was a bug,
>>> or intentional?
>>
>> We don't do that with UPDATE, so why would we do it with this?
>
> Because this is new, harder and non-standard, so there is no other
> place to look. If you want to persuade us that MERGE has poorly
> defined concurrency, so you have implemented a new command, the new
> command had better have very well defined behaviour.

I'm making a point about the structure of the docs here. The behavior
*is* documented, just not in the INSERT documentation, a situation
I've compare with how EvalPlanQual() isn't discussed in the
UPDATE/DELETE/SELECT FOR UPDATE docs. And EvalPlanQual() has some
pretty surprising corner-case behaviors.

That having been said, maybe I could have gone into more detail on the
"consensus among unique indexes" thing in another part of the
documentation, since that isn't separately covered (only the aspects
of when the predicate is evaluated in READ COMMITTED mode and other
things like that were covered).

> For example, this patch for UPSERT doesn't support updatable views.
> But I can't see anyone that didn't read the patch would know that.

By reading the CREATE VIEW docs. Maybe there could stand to be a
compatibility note in the main INSERT command, but I didn't want to do
that as long as things were up in the air. It might be the case that
we figure out good behavior for updatable views.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 09/30/2014 11:20 AM, Peter Geoghegan wrote:
>> > For example, this patch for UPSERT doesn't support updatable views.
>> > But I can't see anyone that didn't read the patch would know that.
> By reading the CREATE VIEW docs. Maybe there could stand to be a
> compatibility note in the main INSERT command, but I didn't want to do
> that as long as things were up in the air. It might be the case that
> we figure out good behavior for updatable views.

All of these things sound like good ideas for documentation
improvements, but hardly anything which should block the patch.  It has
documentation, more than we'd require for a lot of other patches, and
it's not like the 9.5 release is next month.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-30 11:49:21 -0700, Josh Berkus wrote:
> On 09/30/2014 11:20 AM, Peter Geoghegan wrote:
> >> > For example, this patch for UPSERT doesn't support updatable views.
> >> > But I can't see anyone that didn't read the patch would know that.
> > By reading the CREATE VIEW docs. Maybe there could stand to be a
> > compatibility note in the main INSERT command, but I didn't want to do
> > that as long as things were up in the air. It might be the case that
> > we figure out good behavior for updatable views.
> 
> All of these things sound like good ideas for documentation
> improvements, but hardly anything which should block the patch.  It has
> documentation, more than we'd require for a lot of other patches, and
> it's not like the 9.5 release is next month.

What's blocking it is that (afaik) no committer agrees with the approach
taken to solve the concurrency problems. And several (Heikki, Robert,
me) have stated their dislike of the proposed approach.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 09/30/2014 11:51 AM, Andres Freund wrote:
>> All of these things sound like good ideas for documentation
>> > improvements, but hardly anything which should block the patch.  It has
>> > documentation, more than we'd require for a lot of other patches, and
>> > it's not like the 9.5 release is next month.
> What's blocking it is that (afaik) no committer agrees with the approach
> taken to solve the concurrency problems. And several (Heikki, Robert,
> me) have stated their dislike of the proposed approach.

If that's what's blocking it then fine.  But if we might change the
concurrency approach, then what's the point in quibbling about docs?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 09/30/2014 07:15 AM, Kevin Grittner wrote:
> There are certainly other ways to do it, although they require more
> work.  As far as UPSERT goes, I agree that we should require such
> an index, at least for the initial implementation and into the
> foreseeable future.  What I'm saying is that if we implement it
> using the standard MERGE syntax, then if the features of MERGE are
> extended it will continue to work even in the absence of such an
> index.  The index becomes a way of optimizing access rather than
> defining what access is allowed.
> 
> At the risk of pushing people away from this POV, I'll point out
> that this is somewhat similar to what we do for unlogged bulk loads
> -- if all the conditions for doing it the fast way are present, we
> do it the fast way; otherwise it still works, but slower.

Except that switching between fast/slow bulk loads affects *only* the
speed of loading, not the locking rules.  Having a statement silently
take a full table lock when we were expecting it to be concurrent
(because, for example, the index got rebuilt and someone forgot the
UNIQUE) violates POLA from my perspective.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 11:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> What's blocking it is that (afaik) no committer agrees with the approach
> taken to solve the concurrency problems. And several (Heikki, Robert,
> me) have stated their dislike of the proposed approach.

Well, it depends on what you mean by "approach to concurrency
problems". It's not as if a consensus has emerged in favor of another
approach, and if there is to be another approach, the details need to
be worked out ASAP. Even still, I would appreciate it if people could
review the patch on the assumption that those issues will be worked
out. After all, there are plenty of other parts to this that have
nothing to do with value locking - the entire "top half", which has
significant subtleties (some involving concurrency) in its own right,
reasonably well encapsulated from value locking. A couple of weeks
ago, I felt good about the fact that it seemed "time was on my side"
9.5-wise, but maybe that isn't true. Working through the community
process for this patch is going to be very difficult.

I think everyone understands that there could be several ways of
implementing value locking. I really do think it's a well encapsulated
aspect of the patch, though, so even if you hate how I've implemented
value locking, please try and give feedback on everything else. Simon
wanted to start with the user-visible semantics, which makes sense,
but I see no reason to limit it to that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-30 12:05:46 -0700, Peter Geoghegan wrote:
> On Tue, Sep 30, 2014 at 11:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > What's blocking it is that (afaik) no committer agrees with the approach
> > taken to solve the concurrency problems. And several (Heikki, Robert,
> > me) have stated their dislike of the proposed approach.
> 
> Well, it depends on what you mean by "approach to concurrency
> problems". It's not as if a consensus has emerged in favor of another
> approach, and if there is to be another approach, the details need to
> be worked out ASAP.

Well. People have given you outlines of approaches. And Heikki even gave
you a somewhat working prototype. I don't think you can fairly expect
more.

> Even still, I would appreciate it if people could
> review the patch on the assumption that those issues will be worked
> out.

Right now I don't really see the point. You've so far shown no
inclination to accept significant concerns about your approach. And
without an agreement about how to solve the concurrency issues the
feature is dead in the water. And thus time spent reviewing isn't well
spent.

I'm pretty sure I'm not the only one feeling that way at this point.

> A couple of weeks
> ago, I felt good about the fact that it seemed "time was on my side"
> 9.5-wise, but maybe that isn't true. Working through the community
> process for this patch is going to be very difficult.

The community process involves accepting that your opinion isn't the
community's. Believe me, I learned that the hard way.

It's one thing to argue about the implementation of a feature for a week
or four. Or even insist that you're right in some implementation detail
local to your new code. But you've not moved one jota in the critical
parts that affect large parts of the system in half a year.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> On 09/30/2014 07:15 AM, Kevin Grittner wrote:

>> At the risk of pushing people away from this POV, I'll point out
>> that this is somewhat similar to what we do for unlogged bulk loads
>> -- if all the conditions for doing it the fast way are present, we
>> do it the fast way; otherwise it still works, but slower.
>
> Except that switching between fast/slow bulk loads affects *only* the
> speed of loading, not the locking rules.  Having a statement silently
> take a full table lock when we were expecting it to be concurrent
> (because, for example, the index got rebuilt and someone forgot the
> UNIQUE) violates POLA from my perspective.

I would not think that an approach which took a full table lock to
implement the more general case would be accepted.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 09/30/2014 02:39 PM, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
>> On 09/30/2014 07:15 AM, Kevin Grittner wrote:
> 
>>> At the risk of pushing people away from this POV, I'll point out
>>> that this is somewhat similar to what we do for unlogged bulk loads
>>> -- if all the conditions for doing it the fast way are present, we
>>> do it the fast way; otherwise it still works, but slower.
>>
>> Except that switching between fast/slow bulk loads affects *only* the
>> speed of loading, not the locking rules.  Having a statement silently
>> take a full table lock when we were expecting it to be concurrent
>> (because, for example, the index got rebuilt and someone forgot the
>> UNIQUE) violates POLA from my perspective.
> 
> I would not think that an approach which took a full table lock to
> implement the more general case would be accepted.

Why not?  There are certainly cases ... like bulk loading ... where
users would find it completely acceptable.  Imagine that you're merging
3 files into a single unlogged table before processing them into
finished data.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> On 09/30/2014 02:39 PM, Kevin Grittner wrote:
>> Josh Berkus <josh@agliodbs.com> wrote:
>>> On 09/30/2014 07:15 AM, Kevin Grittner wrote:
>>>
>>>> At the risk of pushing people away from this POV, I'll point out
>>>> that this is somewhat similar to what we do for unlogged bulk loads
>>>> -- if all the conditions for doing it the fast way are present, we
>>>> do it the fast way; otherwise it still works, but slower.
>>>
>>> Except that switching between fast/slow bulk loads affects *only* the
>>> speed of loading, not the locking rules.  Having a statement silently
>>> take a full table lock when we were expecting it to be concurrent
>>> (because, for example, the index got rebuilt and someone forgot the
>>> UNIQUE) violates POLA from my perspective.
>>
>> I would not think that an approach which took a full table lock to
>> implement the more general case would be accepted.
>
> Why not?  There are certainly cases ... like bulk loading ... where
> users would find it completely acceptable.  Imagine that you're merging
> 3 files into a single unlogged table before processing them into
> finished data.

So the expectation is that when we implement MERGE it will, by
default, take out an EXCLUSIVE lock for the entire target table for
the entire duration of the command?  I would have expected a bit
more finesse.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 09/30/2014 02:51 PM, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
>> On 09/30/2014 02:39 PM, Kevin Grittner wrote:
>>> Josh Berkus <josh@agliodbs.com> wrote:
>>>> On 09/30/2014 07:15 AM, Kevin Grittner wrote:
>>>>
>>>>> At the risk of pushing people away from this POV, I'll point out
>>>>> that this is somewhat similar to what we do for unlogged bulk loads
>>>>> -- if all the conditions for doing it the fast way are present, we
>>>>> do it the fast way; otherwise it still works, but slower.
>>>>
>>>> Except that switching between fast/slow bulk loads affects *only* the
>>>> speed of loading, not the locking rules.  Having a statement silently
>>>> take a full table lock when we were expecting it to be concurrent
>>>> (because, for example, the index got rebuilt and someone forgot the
>>>> UNIQUE) violates POLA from my perspective.
>>>
>>> I would not think that an approach which took a full table lock to
>>> implement the more general case would be accepted.
>>
>> Why not?  There are certainly cases ... like bulk loading ... where
>> users would find it completely acceptable.  Imagine that you're merging
>> 3 files into a single unlogged table before processing them into
>> finished data.
> 
> So the expectation is that when we implement MERGE it will, by
> default, take out an EXCLUSIVE lock for the entire target table for
> the entire duration of the command?  I would have expected a bit
> more finesse.

I don't know that that is the *expectation*.  However, I personally
would find it *acceptable* if it meant that we could get efficient merge
semantics on other aspects of the syntax, since my primary use for MERGE
is bulk loading.

Regardless, I don't think there's any theoretical way to support UPSERT
without a unique constraint.  Therefore eventual support of this would
require a full table lock.  Therefore having it use the same command as
UPSERT with a unique constraint is a bit of a booby trap for users.
This is a lot like the "ADD COLUMN with a default rewrites the whole
table" booby trap which hundreds of our users complain about every
month.  We don't want to add more such unexpected consequences for users.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-30 14:51:57 -0700, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
> > On 09/30/2014 02:39 PM, Kevin Grittner wrote:
> >> Josh Berkus <josh@agliodbs.com> wrote:
> >>> On 09/30/2014 07:15 AM, Kevin Grittner wrote:
> >>>
> >>>> At the risk of pushing people away from this POV, I'll point out
> >>>> that this is somewhat similar to what we do for unlogged bulk loads
> >>>> -- if all the conditions for doing it the fast way are present, we
> >>>> do it the fast way; otherwise it still works, but slower.
> >>>
> >>> Except that switching between fast/slow bulk loads affects *only* the
> >>> speed of loading, not the locking rules.  Having a statement silently
> >>> take a full table lock when we were expecting it to be concurrent
> >>> (because, for example, the index got rebuilt and someone forgot the
> >>> UNIQUE) violates POLA from my perspective.
> >>
> >> I would not think that an approach which took a full table lock to
> >> implement the more general case would be accepted.
> >
> > Why not?  There are certainly cases ... like bulk loading ... where
> > users would find it completely acceptable.  Imagine that you're merging
> > 3 files into a single unlogged table before processing them into
> > finished data.
> 
> So the expectation is that when we implement MERGE it will, by
> default, take out an EXCLUSIVE lock for the entire target table for
> the entire duration of the command?  I would have expected a bit
> more finesse.

I think it'd be acceptable. Alternatively we'll just accept that you can
get uniqueness violations under concurrency. I many cases that'll be
fine.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 3:01 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> I think it'd be acceptable. Alternatively we'll just accept that you can
> get uniqueness violations under concurrency. I many cases that'll be
> fine.


I think living with unique violations is the right thing with MERGE, fwiw.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-30 14:57:43 -0700, Josh Berkus wrote:
> Regardless, I don't think there's any theoretical way to support UPSERT
> without a unique constraint.

You can do stuff like blocking predicate locking. But without indexes to
support it that gets awfully complicated and unfunny. I don't think we
want to go there. So essentially I agree with that statement.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-09-30 14:57:43 -0700, Josh Berkus wrote:
>
>> Regardless, I don't think there's any theoretical way to support
>> UPSERT without a unique constraint.
>
> You can do stuff like blocking predicate locking. But without indexes to
> support it that gets awfully complicated and unfunny. I don't think we
> want to go there. So essentially I agree with that statement.

Well, as you seem to be saying, it's not to bad with even an
non-unique index if we wanted to do a little extra work; and there
are a lot of ways to potentially deal with it even without that.
Theoretically, the number of ways to do this is limited only by
time available to brainstorm.

That said, at no time have I advocated that we try to implement
UPSERT in this release with anything but a UNIQUE index.  The issue
I raised was whether a subset of the MERGE syntax should be used to
specify UPSERT rather than inventing our own syntax -- which
doesn't seem in any way incompatible requiring a unique index to
match the expression.  Given subsequent discussion, perhaps we
could decorate it with something to indicate which manner of
concurrency handling is desired?  Techniques discussed so far are
- UPSERT style- Hold an EXCLUSIVE lock on the table- Allow "native" concurrency management

An alternative which seems to be on some people's minds is to use a
different command name for the first option (but why not keep the
rest of the standard syntax?) and to require an explicit LOCK TABLE
statement at the start of the transaction if you want the second
option.

My preference, after this discussion, would be to default to UPSERT
style if the appropriate conditions are met, and to default to the
third option otherwise.  If you want an exclusive lock, ask for it
with the LOCK TABLE statement.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 2:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Well. People have given you outlines of approaches. And Heikki even gave
> you a somewhat working prototype. I don't think you can fairly expect
> more.

I don't expect anything, really. I asked nicely - that's all. I don't
know why there is so much discussion of what I expect or don't expect.
Things don't work around here by everyone doing only strictly what
they're obligated to do. Everyone is strictly obligated to do nothing,
when you get right down to it.

>> Even still, I would appreciate it if people could
>> review the patch on the assumption that those issues will be worked
>> out.
>
> Right now I don't really see the point. You've so far shown no
> inclination to accept significant concerns about your approach. And
> without an agreement about how to solve the concurrency issues the
> feature is dead in the water. And thus time spent reviewing isn't well
> spent.
>
> I'm pretty sure I'm not the only one feeling that way at this point.

I think that's *incredibly* unfair. There appears to be broad
acceptance of the problems around deadlocking as a result of my work
with Heikki. That was a major step forward. Now we all agree on the
parameters of the discussion around value locking, AFAICT. There is an
actual way forward, and not total quagmire -- great. I had to dig my
heals in to win that much, and it wasn't easy. I accept that it
probably wasn't easy for other people either, and I am thankful for
the effort of other people, particularly Heikki, but also you.

>> A couple of weeks
>> ago, I felt good about the fact that it seemed "time was on my side"
>> 9.5-wise, but maybe that isn't true. Working through the community
>> process for this patch is going to be very difficult.
>
> The community process involves accepting that your opinion isn't the
> community's. Believe me, I learned that the hard way.

The community doesn't have a worked-out opinion on this either way.
Arguably, what you and Simon want to do is closer than what I want to
do than what Heikki wants to do - you're still talking about adding
locks that are tied to AMs in a fairly fundamental way. But, FWIW, I'd
sooner take Heikki's approach than insert promise tuples into indexes
directly. I think that Heikki's approach is better.

In all honesty, I don't care who "wins", as long as someone does and
we get the feature in shape. No one can "win" if all sides are not
realistic about the problems. The issues that I've called out about
what Heikki has suggested are quite significant issues. Can't we talk
about them? Or am I required to polish-up Heikki's approach, and
present it at a commitfest, only to have somebody point out the same
issues then? I am *not* nitpicking, and the issues are of fundamental
importance. Look at the issues I raise and you'll see that's the case.

My pointing out of these issues is not some artifice to "win" the
argument. I don't appreciate the insinuation that it is. I am
completely undeserving of that sort of mistrust. It's insulting. And
it's also a total misrepresentation to suggest it's me versus you,
Heikki, Robert, and Simon. Opinion is far more divided than you let
on, since what you and Simon suggest is far different to what Heikki
suggests. Let's figure out a way to reach agreement.

> It's one thing to argue about the implementation of a feature for a week
> or four. Or even insist that you're right in some implementation detail
> local to your new code. But you've not moved one jota in the critical
> parts that affect large parts of the system in half a year.

You're right. I haven't moved one bit on that. But, on the other hand,
I haven't doubled down on the approach either - I have done very
little on it, and have given it relatively little thought either way.
I preferred to focus my energies on the "top half". Surely you'd agree
that that was the logical course of action to take over the last few
months. I don't know if you noticed, but I presented this whole new
revised version as "this is the thing that gives us the ability to
discuss the fundamental issue of value locking". So my suggestion was
that if you don't want to have that conversation, at least look at the
"top half" a bit.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 30 September 2014 19:49, Josh Berkus <josh@agliodbs.com> wrote:
> On 09/30/2014 11:20 AM, Peter Geoghegan wrote:
>>> > For example, this patch for UPSERT doesn't support updatable views.
>>> > But I can't see anyone that didn't read the patch would know that.
>> By reading the CREATE VIEW docs. Maybe there could stand to be a
>> compatibility note in the main INSERT command, but I didn't want to do
>> that as long as things were up in the air. It might be the case that
>> we figure out good behavior for updatable views.
>
> All of these things sound like good ideas for documentation
> improvements, but hardly anything which should block the patch.  It has
> documentation, more than we'd require for a lot of other patches, and
> it's not like the 9.5 release is next month.

We won't get consensus simply by saying "Would you like a fast upsert
feature?" because everyone says Yes to that.

A clear description of the feature being added is necessary to agree
its acceptance. When we implement a SQL Standard feature, we can just
look in the standard to see how it should work and compare. When we go
off-piste, we need more info to make sure we know what we are getting
as well as why we are not getting something from the Standard.

I have not suggested I would block the patch because it doesn't have
docs. I have pointed out that the lack of consensus about the patch is
because nobody knows what it contains, which others agreed with. My
request was, and is, a proposed mechanism to *unblock* a very
obviously stalled patch.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Sep 30, 2014 at 4:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> A clear description of the feature being added is necessary to agree
> its acceptance. When we implement a SQL Standard feature, we can just
> look in the standard to see how it should work and compare. When we go
> off-piste, we need more info to make sure we know what we are getting
> as well as why we are not getting something from the Standard.

I think that's fair.

> I have not suggested I would block the patch because it doesn't have
> docs. I have pointed out that the lack of consensus about the patch is
> because nobody knows what it contains, which others agreed with. My
> request was, and is, a proposed mechanism to *unblock* a very
> obviously stalled patch.

Please keep asking questions - it isn't necessarily obvious to me
*what* isn't clear, because of my lack of perspective. That's a useful
role. It occurs to me now that I ought to have found a place to
document "cardinality violations" [1], but I didn't, for example.

[1] http://tracker.firebirdsql.org/browse/CORE-2274
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andres Freund
Date:
On 2014-09-26 16:19:33 -0700, Peter Geoghegan wrote:
> On Fri, Sep 26, 2014 at 3:25 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > On Fri, Sep 26, 2014 at 3:11 PM, Alvaro Herrera
> > <alvherre@2ndquadrant.com> wrote:
> >> FWIW there are 28 callers of HeapTupleHeaderGetXmin.
> 
> > Don't forget about direct callers to HeapTupleHeaderGetRawXmin(),
> > though. There are plenty of those in tqual.c.
> 
> Which reminds me: commit 37484ad2 added the opportunistic freezing
> stuff. To quote the commit message:
> 
> """
> Instead of changing the tuple xmin to FrozenTransactionId, the combination
> of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never
> set together, is now defined as HEAP_XMIN_FROZEN.  A variety of previous
> proposals to freeze tuples opportunistically before vacuum_freeze_min_age
> is reached have foundered on the objection that replacing xmin by
> FrozenTransactionId might hinder debugging efforts when things in this
> area go awry; this patch is intended to solve that problem by keeping
> the XID around (but largely ignoring the value to which it is set).
> 
> """
> 
> Why wouldn't the same objection (the objection that the earlier
> opportunistic freezing ideas stalled on) apply to directly setting
> tuple xmin to InvalidTransactionId?

Because it's pretty much unrelated? The FrozenTransactionId bit you
reference is about tuples that actually survive, which isn't the case
here.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Bruce Momjian
Date:
On Tue, Sep 30, 2014 at 02:57:43PM -0700, Josh Berkus wrote:
> I don't know that that is the *expectation*.  However, I personally
> would find it *acceptable* if it meant that we could get efficient merge
> semantics on other aspects of the syntax, since my primary use for MERGE
> is bulk loading.
> 
> Regardless, I don't think there's any theoretical way to support UPSERT
> without a unique constraint.  Therefore eventual support of this would
> require a full table lock.  Therefore having it use the same command as
> UPSERT with a unique constraint is a bit of a booby trap for users.
> This is a lot like the "ADD COLUMN with a default rewrites the whole
> table" booby trap which hundreds of our users complain about every
> month.  We don't want to add more such unexpected consequences for users.

I think if we use the MERGE command for this feature we would need to
use a non-standard keyword to specify that we want OLTP/UPSERT
functionality.  That would allow us to mostly use the MERGE standard
syntax without having surprises about non-standard behavior.  I am
thinking of how CONCURRENTLY changes the behavior of some commands.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 2, 2014 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote:
> I think if we use the MERGE command for this feature we would need to
> use a non-standard keyword to specify that we want OLTP/UPSERT
> functionality.  That would allow us to mostly use the MERGE standard
> syntax without having surprises about non-standard behavior.  I am
> thinking of how CONCURRENTLY changes the behavior of some commands.

That would leave you without a real general syntax. It'd also make
having certain aspects of an UPSERT more explicit be a harder goal
(there is no conventional join involved here - everything goes through
a unique index). Adding the magic keyword would break certain other
parts of the statement, so you'd have exact rules for what worked
where. I see no advantage, and considerable disadvantages.

Note that I've documented a lot of this stuff here:

https://wiki.postgresql.org/wiki/UPSERT

Mapping the join thing onto which unique index you want to make the
UPSERT target is very messy. There are a lot of corner cases. It's
quite ticklish.

Please add to it if you think we've missed something.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Bruce Momjian
Date:
On Thu, Oct  2, 2014 at 02:08:30PM -0700, Peter Geoghegan wrote:
> On Thu, Oct 2, 2014 at 1:10 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > I think if we use the MERGE command for this feature we would need to
> > use a non-standard keyword to specify that we want OLTP/UPSERT
> > functionality.  That would allow us to mostly use the MERGE standard
> > syntax without having surprises about non-standard behavior.  I am
> > thinking of how CONCURRENTLY changes the behavior of some commands.
> 
> That would leave you without a real general syntax. It'd also make
> having certain aspects of an UPSERT more explicit be a harder goal
> (there is no conventional join involved here - everything goes through
> a unique index). Adding the magic keyword would break certain other
> parts of the statement, so you'd have exact rules for what worked
> where. I see no advantage, and considerable disadvantages.
> 
> Note that I've documented a lot of this stuff here:
> 
> https://wiki.postgresql.org/wiki/UPSERT
> 
> Mapping the join thing onto which unique index you want to make the
> UPSERT target is very messy. There are a lot of corner cases. It's
> quite ticklish.
> 
> Please add to it if you think we've missed something.

OK, it is was just an idea I wanted to point out, and if it doesn't
work, it more clearly cements that we need UPSERT _and_ MERGE.

Josh was pointing out that we don't want to surprise our users, so I
suggested an additional keyword, which addresses his objections, but as
you said, if that standard MERGE syntax doesn't give us what we want,
then that is the fatal objection to using only MERGE.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Marti Raudsepp
Date:
On Thu, Sep 4, 2014 at 12:13 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Sep 3, 2014 at 9:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> INSERT INTO upsert(key, val) VALUES(1, 'insert') ON CONFLICT WITHIN
>>> upsert_pkey UPDATE SET val = 'update';
>>
>> It seems to me that it would be better to specify a conflicting column
>> set rather than a conflicting index name.
>
> I'm open to pursuing that, provided there is a possible implementation
> that's robust against things like BEFORE triggers that modify
> constrained attributes. It must also work well with partial unique
> indexes. So I imagine we'd have to determine a way of looking up the
> unique index only after BEFORE triggers fire. Unless you're
> comfortable with punting on some of these cases by throwing an error,
> then all of this is actually surprisingly ticklish.

Speaking of this, I really don't like the proposed behavior of firing
BEFORE INSERT triggers even before we've decided whether to insert or
update. In the "classical" upsert pattern, changes by a BEFORE INSERT
trigger would get rolled back on conflict, but the new approach seems
surprising: changes from BEFORE INSERT get persisted in the database,
but AFTER INSERT is not fired.

I haven't found any discussion about alternative triggers semantics
for upsert. If there has been any, can you point me to it?

----
How about this: use the original VALUES results for acquiring a value
lock; if indeed the row didn't conflict, *then* fire BEFORE INSERT
triggers, and throw an error if the trigger changed any columns of the
(specified?) unique key.

Advantages of this approach:
1. Would solve the above conundrum about specifying a unique index via columns.
2. In the UPDATE case we can skip evaluating INSERT triggers and
DEFAULT expressions for columns
3. If I'm not missing anything, this approach may also let us get rid
of the CONFLICTING() construct
4. Possibly be closer to MySQL's syntax?

Point (2) is actually a major consideration IMO: if your query is
mostly performing UPDATEs, on a table with SERIAL keys, and you're
using a different key to perform the updates, then you waste sequence
values unnecessarily. I believe this is a very common pattern, for
example:

create table evt_type (id serial primary key, name text unique, evt_count int);
prepare upsert(text) as INSERT into evt_type (name, evt_count) values ($1, 1)
on conflict within evt_type_name_key UPDATE set evt_count=evt_count+1;

execute upsert('foo');
execute upsert('foo');
execute upsert('bar');

# table evt_type;id | name | evt_count
----+------+----------- 1 | foo  |         2 3 | bar  |         1   <-- id could very well be "2"

Regards,
Marti



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 2 October 2014 22:37, Bruce Momjian <bruce@momjian.us> wrote:

> OK, it is was just an idea I wanted to point out, and if it doesn't
> work, it more clearly cements that we need UPSERT _and_ MERGE.

It seems clear that having two different initial keywords is popular
because it provides clarity about which aspects of the commands will
be supported.

I like the idea of making the two commands as close as possible in
syntax, which will make it easier to program for and encourage
adoption.
The command name could easily be MERGE [CONCURRENTLY] since that uses
the same concept from earlier DDL syntax/keywords.

In UPSERT, we don't need the ON keyword at all. If we are altering the
syntax, then we can easily remove this.

IIRC it wasn't agreed that we needed to identify which indexes in the
upsert SQL statement itself, since this would be possible in other
ways and would require programmers to know which unique constraints
are declared.

All of the other syntax could easily remain the same, leaving us with
a command that looks like this...

MERGE CONCURRENTLY INTO foo USING VALUES ()
WHEN NOT MATCHED THEN INSERT
WHEN MATCHED THENUPDATE

Since MERGE now supports DELETE and IGNORE as options, presumably we
would also want to support those for the UPSERT version also.
I think it would be useful to also support a mechanism for raising an
error, as DB2 allows.

More complex example of MERGE

MERGE INTO product AS T     USING (SELECT sales.id, sum(sold) AS sold, max(catalog.name) as name    FROM sales, catalog
WHEREsales.id = catalog.id GROUP BY sales.id) AS S       ON S.id = T.id WHEN MATCHED AND T.inventory = S.sold
  THEN DELETE WHEN MATCHED AND T.inventory < S.sold              THEN SIGNAL SQLSTATE '78000' SET MESSAGE_TEXT =
 
'Oversold: ' || S.name WHEN MATCHED              THEN UPDATE SET inventory = T.inventory - S.sold WHEN NOT MATCHED
      THEN INSERT VALUES(S.id, S.name, -S.sold);
 

Full example would be similar to this

MERGE CONCURRENTLY INTO product AS T     USING VALUES () WHEN MATCHED AND T.inventory = S.sold              THEN DELETE
WHENMATCHED AND T.inventory < S.sold              THEN SIGNAL SQLSTATE '78000' SET MESSAGE_TEXT =
 
'Oversold: ' || S.name WHEN MATCHED              THEN UPDATE SET inventory = T.inventory - S.sold WHEN NOT MATCHED
      THEN INSERT VALUES(S.id, S.name, -S.sold);
 

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Oct 7, 2014 at 5:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> IIRC it wasn't agreed that we needed to identify which indexes in the
> upsert SQL statement itself, since this would be possible in other
> ways and would require programmers to know which unique constraints
> are declared.

Kevin seemed quite concerned about that. That is something that seems
hard to reconcile with supporting the MERGE syntax. Perhaps Kevin can
comment on that, since he was in favor of both being able to specify
user intent by accepting a unique index, while also being in favor of
the MERGE syntax.

> All of the other syntax could easily remain the same, leaving us with
> a command that looks like this...
>
> MERGE CONCURRENTLY INTO foo USING VALUES ()
> WHEN NOT MATCHED THEN
>   INSERT
> WHEN MATCHED THEN
>  UPDATE
>
> Since MERGE now supports DELETE and IGNORE as options, presumably we
> would also want to support those for the UPSERT version also.
> I think it would be useful to also support a mechanism for raising an
> error, as DB2 allows.

It seems like what you're talking about here is just changing the
spelling of what I already have. I think that would be confusing to
users when the time comes to actually implement a fully-generalized
MERGE, even with the clearly distinct MERGE CONCURRENTLY variant
outlined here (which, of course, lacks an outer join, unlike MERGE
proper).

However, unlike the idea of trying to square the circle of producing a
general purpose MERGE command that also supports the UPSERT use-case,
my objection to this much more limited proposal is made purely on
aesthetic grounds. I think that it is not very user-friendly; I do not
think that it's a total disaster, which is what trying to solve both
problems at once (MERGE bulkloading and UPSERTing) would result in. So
FWIW, if the community is really set on something that includes the
keyword MERGE, which is really all you outline here, then I can live
with that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Marti Raudsepp
Date:
On Wed, Oct 8, 2014 at 3:47 AM, Peter Geoghegan <pg@heroku.com> wrote:
> It seems like what you're talking about here is just changing the
> spelling of what I already have.

I think there's a subtle difference in expectations too. The current
BEFORE INSERT trigger behavior is somewhat defensible with an
INSERT-driven syntax (though I don't like it even now [1]). But the
MERGE syntax, to me, strongly implies that insertion doesn't begin
before determining whether a conflict exists or not.

[1] http://www.postgresql.org/message-id/CABRT9RD6zriK+t6mnqQOqaozZ5z1bUaKh+kNY=O9ZqBZFoAuBg@mail.gmail.com

Regards,
Marti



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote:
> I think there's a subtle difference in expectations too. The current
> BEFORE INSERT trigger behavior is somewhat defensible with an
> INSERT-driven syntax (though I don't like it even now [1]).

There is no way around it. We need to fire before row triggers to know
what to insert on the one hand, but on the other hand (in general) we
have zero ability to nullify the effects (or side-effects) of before
triggers, since they may execute arbitrary user-defined code. I think
there is a good case to be made for severely restricting what before
row triggers can do, but it's too late for that.

> But the
> MERGE syntax, to me, strongly implies that insertion doesn't begin
> before determining whether a conflict exists or not.

I think you're right. Another strike against the MERGE syntax, then,
since as I said we cannot even know what to check prior to having
before row insert triggers fire.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Marti Raudsepp
Date:
On Tue, Oct 7, 2014 at 2:27 PM, Marti Raudsepp <marti@juffo.org> wrote:
> but the new approach seems
> surprising: changes from BEFORE INSERT get persisted in the database,
> but AFTER INSERT is not fired.

I am sorry, I realize now that I misunderstood the current proposed
trigger behavior, I thought what Simon Riggs wrote here already
happens:
https://groups.google.com/forum/#!msg/django-developers/hdzkoLYVjBY/bnXyBVqx95EJ

But the point still stands: firing INSERT triggers when the UPDATE
path is taken is counterintuitive.  If we prevent changes of upsert
key columns in BEFORE triggers then we get the benefits, including
more straightforward trigger behavior and avoid problems with serial
columns.

Regards,
Marti



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Marti Raudsepp
Date:
On Wed, Oct 8, 2014 at 12:28 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote:
>> I think there's a subtle difference in expectations too. The current
>> BEFORE INSERT trigger behavior is somewhat defensible with an
>> INSERT-driven syntax (though I don't like it even now [1]).
>
> There is no way around it. We need to fire before row triggers to know
> what to insert on the one hand, but on the other hand (in general) we
> have zero ability to nullify the effects (or side-effects) of before
> triggers, since they may execute arbitrary user-defined code.

With my proposal this problem disappears: if we prevent BEFORE
triggers from changing key attributes of NEW in the case of upsert,
then we can acquire value locks before firing any triggers (before
even constructing the whole tuple), and have a guarantee that the
value locks are still valid by the time we proceed with the actual
insert/update.

Other than changing NEW, the side effects of triggers are not relevant.

Now, there may very well be reasons why this is tricky to implement,
but I haven't heard any. Can you see any concrete reasons why this
won't work? I can take a shot at implementing this, if you're willing
to consider it.

> I think
> there is a good case to be made for severely restricting what before
> row triggers can do, but it's too late for that.

There are no users of new "upsert" syntax out there yet, so it's not
too late to rehash the semantics of that. This in no way affects users
of old INSERT/UPDATE syntax.

Regards,
Marti



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 8 October 2014 01:47, Peter Geoghegan <pg@heroku.com> wrote:

> It seems like what you're talking about here is just changing the
> spelling of what I already have. I think that would be confusing to
> users when the time comes to actually implement a fully-generalized
> MERGE, even with the clearly distinct MERGE CONCURRENTLY variant
> outlined here (which, of course, lacks an outer join, unlike MERGE
> proper).

I change my view on this, after some more thought. (Hope that helps)

If we implement MERGE, I can see we may also wish to implement MERGE
CONCURRENTLY one day. That would be different to UPSERT.

So in the future I think we will need 3 commands

1. MERGE
2. MERGE CONCURRENTLY
3. UPSERT

So we no longer need to have the command start with the MERGE keyword.

> However, unlike the idea of trying to square the circle of producing a
> general purpose MERGE command that also supports the UPSERT use-case,
> my objection to this much more limited proposal is made purely on
> aesthetic grounds. I think that it is not very user-friendly; I do not
> think that it's a total disaster, which is what trying to solve both
> problems at once (MERGE bulkloading and UPSERTing) would result in. So
> FWIW, if the community is really set on something that includes the
> keyword MERGE, which is really all you outline here, then I can live
> with that.

We will one day have MERGE according to the SQL Standard.

My opinion is that syntax for this should be similar to MERGE in the
*body* of the command, rather than some completely different syntax.
e.g.

> WHEN NOT MATCHED THEN
>   INSERT
> WHEN MATCHED THEN
>  UPDATE

I'm happy that we put that to a vote on what the syntax should be, as
long as we bear in mind that we will one day have MERGE as well.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Having said that, it would be much nicer to have a mode that allows
> you to just say the word "UPDATE" and have it copy the data into the
> correct columns, like MySQL does. That is very intuitive, even if it
> isn't very flexible.

I thought about this, and at first I agreed, but now I'm not so sure.

Consider the case where you write an INSERT ... ON CONFLICT UPDATE ALL
query, or however we might spell this idea.

1. Developer writes the query, and it works fine.

2. Some time later, the DBA adds an inserted_at column (those are
common). The DBA is not aware of the existence of this particular
query. The new column has a default value of now(), say.

Should we UPDATE the inserted_at column to be NULL? Or (more
plausibly) the default value filled in by the INSERT? Or leave it be?
I think there is a case to be made for all of these behaviors, and
that tension makes me prefer to not do this at all. It's like
encouraging "SELECT *" queries in production, only worse.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Oct 8, 2014 at 12:12 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Sep 29, 2014 at 7:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Having said that, it would be much nicer to have a mode that allows
>> you to just say the word "UPDATE" and have it copy the data into the
>> correct columns, like MySQL does. That is very intuitive, even if it
>> isn't very flexible.
>
> I thought about this, and at first I agreed, but now I'm not so sure.

Actually, I don't think MySQL supports this. It doesn't allow INSERT
ON DUPLICATE KEY UPDATE to do it, AFAICT. Their REPLACE syntax
supports that, but that's a feature that is quite distinct to what I
have in mind here.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Oct 8, 2014 at 6:25 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I change my view on this, after some more thought. (Hope that helps)

Great.

> If we implement MERGE, I can see we may also wish to implement MERGE
> CONCURRENTLY one day. That would be different to UPSERT.
>
> So in the future I think we will need 3 commands
>
> 1. MERGE
> 2. MERGE CONCURRENTLY
> 3. UPSERT
>
> So we no longer need to have the command start with the MERGE keyword.

As I've outlined, I don't see how MERGE CONCURRENTLY could ever work,
but I'm glad that you agree that it should not block this effort (or
indeed, some later effort to implement a MERGE that is comparable to
the implementations of other database systems).

> We will one day have MERGE according to the SQL Standard.

Agreed.

> My opinion is that syntax for this should be similar to MERGE in the
> *body* of the command, rather than some completely different syntax.
> e.g.
>
>> WHEN NOT MATCHED THEN
>>   INSERT
>> WHEN MATCHED THEN
>>  UPDATE
>
> I'm happy that we put that to a vote on what the syntax should be, as
> long as we bear in mind that we will one day have MERGE as well.

While I am also happy with taking a vote, if we do so I vote against
even this much less MERGE-like syntax. It's verbose, and makes much
less sense when the mechanism is driven by would-be duplicate key
violations rather than an outer join. I also like that when you UPSERT
with the proposed ON CONFLICT UPDATE syntax, you get all the
flexibility of an INSERT - you can use data-modifying CTEs, and nest
the statement in a data-modifying CTE, and "INSERT ... SELECT... ON
CONFLICT UPDATE ..." . And to be honest, it's much simpler to
implement this whole feature as an adjunct to how INSERT statements
are currently processed (during parse analysis, planning and
execution); I don't want to make the syntax work against that. For
example, consider how little I had to change the grammar to make all
of this work:

$ git diff master --stat | grep gramsrc/backend/parser/gram.y                          |  72 ++-

The code footprint of this patch is relatively small, and I think we
can all agree that that's a good thing.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Petr Jelinek
Date:
On 08/10/14 11:28, Peter Geoghegan wrote:
> On Wed, Oct 8, 2014 at 1:36 AM, Marti Raudsepp <marti@juffo.org> wrote:
>> But the
>> MERGE syntax, to me, strongly implies that insertion doesn't begin
>> before determining whether a conflict exists or not.
>
> I think you're right. Another strike against the MERGE syntax, then,
> since as I said we cannot even know what to check prior to having
> before row insert triggers fire.
>

True, but to me it also seems to be strike against using INSERT for this 
as I don't really see how you can make triggers work in a sane way if 
the UPSERT is implemented as part of INSERT (at least I haven't seen any 
proposal that I would consider sane from the user point of view).

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Oct 7, 2014 at 5:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> IIRC it wasn't agreed that we needed to identify which indexes in the
>> upsert SQL statement itself, since this would be possible in other
>> ways and would require programmers to know which unique constraints
>> are declared.
>
> Kevin seemed quite concerned about that. That is something that seems
> hard to reconcile with supporting the MERGE syntax. Perhaps Kevin can
> comment on that, since he was in favor of both being able to specify
> user intent by accepting a unique index, while also being in favor of
> the MERGE syntax.

Well, I mostly wanted to make sure we properly considered what the
implications were of using the standard syntax without other
keywords or decorations before deciding to go the non-standard
route.  In spite of an alarming tendency for people to assume that
meant that I didn't understand the desired semantics, I feel enough
people have understood the question and weighed in in favor of an
explicit choice between semantics, rather than inferring
concurrency handling based on the availability of the index
necessary for the slicker behavior.  I'm willing to concede that
overall consensus is leaning toward the view that UPSERT semantics
should be conditioned on explicit syntax; I'll drop that much going
forward.

Granting that, I will say that I lean toward either the MERGE
syntax with CONCURRENTLY being the flag to use UPSERT semantics, or
a separate UPSERT command which is as close to identical to the
MERGE syntax (other than the opening verb) as possible.  I see that
as still needing the ON clause so that you can specify which values
match which columns from the target table.  I'm fine with starting
with the syntax in the standard, which has no DELETE or IGNORE
options (as of the latest version I've seen).  So the syntax I'm
suggesting is close to what Simon is suggesting, but a more
compliant form would be:

MERGE CONCURRENTLY INTO foo USING (VALUES (valuelist) aliases) ON (conditions) WHEN NOT MATCHED   INSERT [ (columnlist)
]VALUES (valuelist) WHEN MATCHED   UPDATE SET colname = expression [, ...]
 

Rather than pseudo-randomly picking a unique index or using a
constraint or index name, the ON condition would need to allow
matching based on equality to all columns of a unique index which
only referenced NOT NULL columns; we would pick an index which
matched those conditions.  In any event, the unique index would be
required if CONCURRENTLY was specified.  Using column matching to
pick the index (like we do when specifying a FOREIGN KEY
constraint) is more in keeping with other SQL statements, and seems
generally safer to me.  It would also make it fairly painless for
people to switch concurrency techniques for what is, after all,
exactly the same operation except for differences in handling of
concurrent conflicting DML.

As I said, I'm also OK with using UPSERT in place of MERGE
CONCURRENTLY.

I also feel that if we could allow:
 USING (VALUES (valuelist) [, ...])

that would be great.  In fact, I don't see why that can't be pretty
much any relation, but it doesn't have to be for a first cut.  A
relation would allow a temporary table to be loaded with a batch of
rows where the intent is to UPSERT every row in the batch, without
needing to write a loop to do it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 8 October 2014 21:16, Peter Geoghegan <pg@heroku.com> wrote:

>> My opinion is that syntax for this should be similar to MERGE in the
>> *body* of the command, rather than some completely different syntax.
>> e.g.
>>
>>> WHEN NOT MATCHED THEN
>>>   INSERT
>>> WHEN MATCHED THEN
>>>  UPDATE
>>
>> I'm happy that we put that to a vote on what the syntax should be, as
>> long as we bear in mind that we will one day have MERGE as well.
>
> While I am also happy with taking a vote, if we do so I vote against
> even this much less MERGE-like syntax. It's verbose, and makes much
> less sense when the mechanism is driven by would-be duplicate key
> violations rather than an outer join.

It wouldn't be driven by an outer join, not sure where that comes from.

MERGE is verbose, I agree. I don't always like the SQL Standard, I
just think we should follow it as much as possible. You can't change
the fact that MERGE exists, so I don't see a reason to have two
variants of syntax that do roughly the same thing.

MERGE syntax would allow many things, such as this...
WHEN NOT MATCHED AND x > 7 THEN INSERT
WHEN NOT MATCHED THEN INSERT
WHEN MATCHED AND y = 5 THEN DO NOTHING
WHEN MATCHED THENUPDATE

etc

> I also like that when you UPSERT
> with the proposed ON CONFLICT UPDATE syntax, you get all the
> flexibility of an INSERT - you can use data-modifying CTEs, and nest
> the statement in a data-modifying CTE, and "INSERT ... SELECT... ON
> CONFLICT UPDATE ..." . And to be honest, it's much simpler to
> implement this whole feature as an adjunct to how INSERT statements
> are currently processed (during parse analysis, planning and
> execution); I don't want to make the syntax work against that.

I spoke to someone today that preferred a new command keyword, like
UPSERT, because the semantics of triggers are weird. Having a before
insert trigger fire when there is no insert is quite strange. Properly
documenting that on hackers would help; has the comments made on the
Django list been replayed here in some form?

I'm very scared by your comments about data modifying CTEs etc.. You
have no definition of how they will work, not tests of that. That part
isn't looking like a benefit as things currently stand.

I'm still waiting for some more docs to describe your intentions so
they can be reviewed.

Also, it would be useful to hear that your're going to do something
about the references to rows using conflicting(), since nobody has
agreed with you there. Or hopefully even that you've listened and
implemented something differently already. (We need that, whatever we
do with other elements of syntax).

Overall, I'm not seeing too many comments that indicate you are
accepting review comments and acting upon them. If you want acceptance
from others, you need to begin with some yourself.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Oct 8, 2014 at 2:04 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> While I am also happy with taking a vote, if we do so I vote against
>> even this much less MERGE-like syntax. It's verbose, and makes much
>> less sense when the mechanism is driven by would-be duplicate key
>> violations rather than an outer join.
>
> It wouldn't be driven by an outer join, not sure where that comes from.

Right, I understood that it wouldn't be - which is the point. So with
an UPSERT that retains influence from MERGE, NOT MATCHED means "no
conflict", MATCHED means "conflict". That just seems like an odd way
to spell the concept given, as you say, that we're not talking about
an outer join.

> MERGE is verbose, I agree. I don't always like the SQL Standard, I
> just think we should follow it as much as possible. You can't change
> the fact that MERGE exists, so I don't see a reason to have two
> variants of syntax that do roughly the same thing.
>
> MERGE syntax would allow many things, such as this...
> WHEN NOT MATCHED AND x > 7 THEN
>   INSERT
> WHEN NOT MATCHED THEN
>   INSERT
> WHEN MATCHED AND y = 5 THEN
>   DO NOTHING
> WHEN MATCHED THEN
>  UPDATE
>
> etc

But then you can have before row insert triggers fire, which as you
acknowledge is more surprising with this syntax.

> I spoke to someone today that preferred a new command keyword, like
> UPSERT, because the semantics of triggers are weird. Having a before
> insert trigger fire when there is no insert is quite strange. Properly
> documenting that on hackers would help; has the comments made on the
> Django list been replayed here in some form?

Yes. It's also mentioned in the commit message of CONFLICTING() (patch
0003-*). And the documentation (both the proposed INSERT
documentation, and the trigger documentation). There is a large
comment on it in the code. So I've said it many times.

> I'm very scared by your comments about data modifying CTEs etc.. You
> have no definition of how they will work, not tests of that. That part
> isn't looking like a benefit as things currently stand.

Actually, I have a few smoke tests for that. But I don't see any need
for special handling. When you have a data-modifying CTE, it can
contain an INSERT, and there are no special restrictions on that
INSERT (other than that it may not itself have a CTE, but that's true
more generally). You can have data-modifying CTEs containing INSERTs,
and data-modifying CTEs containing UPDATEs....what I've done is have
data-modifying CTEs contain INSERTs that also happen to have an ON
CONFLICT UPDATE clause.

This new clause of INSERTs is in no more need of special documentation
regarding interactions with data-modifying CTEs than UPDATE .... WHERE
CURRENT OF is. The only possible exception I can think of would be
cardinality violations where a vanilla INSERT in one part of a command
(one data-modifying CTE) gives problems to the "UPSERT part" of the
same command (because we give a special cardinality violation message
when we try to update the same tuple twice in the same command). But
that's a pretty imaginative complaint, and I doubt it would really
surprise someone.

Why would you be surprised by the fact that a new clause for INSERT
plays nicely with existing clauses? It's nothing special - there is no
special handling.

> I'm still waiting for some more docs to describe your intentions so
> they can be reviewed.

I think it would be useful to add several more isolation tests,
highlighting some of the cases you talked about. I'll work on that.
While the way forward for WITHIN isn't clear, I think a WITHIN PRIMARY
KEY variant would certainly be useful. Maybe it would be okay to
forget about naming a specific unique index, while supporting an
(optional) WITHIN PRIMARY KEY/NOT WITHIN PRIMARY KEY. It doesn't
totally solve the problems, but may be a good compromise that mostly
satisfies people that want to be able to clearly indicate user intent
(Kevin, in particular), and satisfies other people that don't want to
name a unique index (Heikki, in particular). Certainly, the Django
people would like that, since they said as much.

> Also, it would be useful to hear that your're going to do something
> about the references to rows using conflicting(), since nobody has
> agreed with you there. Or hopefully even that you've listened and
> implemented something differently already. (We need that, whatever we
> do with other elements of syntax).

Do you really expect me to do major work on some aspect of the syntax
like that, given, as you say, that nobody explicitly agreed with me
(and only you disagreed with me)? The only remark I heard on that was
from you (you'd prefer to use NEW.* and OLD.*). But you spent much
more time talking about getting something MERGE-like, which
NEW.*/OLD.* clearly isn't.

CONFLICTING() is very close (identical?) to MySQL's use of "ON
DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that,
but it's news to me that people take particular issue with it.

> Overall, I'm not seeing too many comments that indicate you are
> accepting review comments and acting upon them. If you want acceptance
> from others, you need to begin with some yourself.

What, specifically, have I failed to act on? We are discussing the
syntax here. I have very valid practical reasons for wanting to make
this feature a clause of INSERT. That is a view that Andres seemed to
agree with [1], for example.

[1] http://www.postgresql.org/message-id/20140929070235.GP1169@alap3.anarazel.de
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 8 October 2014 23:24, Peter Geoghegan <pg@heroku.com> wrote:

>> Also, it would be useful to hear that your're going to do something
>> about the references to rows using conflicting(), since nobody has
>> agreed with you there. Or hopefully even that you've listened and
>> implemented something differently already. (We need that, whatever we
>> do with other elements of syntax).
>
> Do you really expect me to do major work on some aspect of the syntax
> like that, given, as you say, that nobody explicitly agreed with me
> (and only you disagreed with me)? The only remark I heard on that was
> from you (you'd prefer to use NEW.* and OLD.*).
> But you spent much
> more time talking about getting something MERGE-like, which
> NEW.*/OLD.* clearly isn't.

Yes, it is. Look at the AS clause.

> CONFLICTING() is very close (identical?) to MySQL's use of "ON
> DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that,
> but it's news to me that people take particular issue with it.

3 people have asked you questions or commented about the use of
CONFLICTING() while I've been watching. It's clearly a non-standard
mechanism and not inline with other Postgres usage. Nobody actually
says "I object to this" - do they need to use that phrase before you
take note?

I'm beginning to feel that giving you review comments is being seen as
some kind of negative action. Needing to repeat myself makes it clear
that you aren't taking note.

Yes, I expect you to do these things
* collect other people's input, even if you personally disagree
* if there is disagreement amongst reviewers, seek to resolve that in
a fair and reasonable manner
* publish a summary of changes requested
* do major work to address them

So yes, I really expect that.

It doesn't matter that it is "only SImon" or "only Kevin". **One**
comment is enough for you to take note.

If there is disagreement, publishing the summary of changes you plan
to make in your next version will help highlight that.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Oct 8, 2014 at 10:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Do you really expect me to do major work on some aspect of the syntax
>> like that, given, as you say, that nobody explicitly agreed with me
>> (and only you disagreed with me)? The only remark I heard on that was
>> from you (you'd prefer to use NEW.* and OLD.*).
>> But you spent much
>> more time talking about getting something MERGE-like, which
>> NEW.*/OLD.* clearly isn't.
>
> Yes, it is. Look at the AS clause.

You can alias each of the two tables being joined. But I only have one
table, and no join. When you referred to NEW.* and OLD.*, you clearly
were making a comparison with trigger WHEN clauses, and not MERGE
(which is a comparison I made myself, although for more technical
reasons). It hardly matters, though.

>> CONFLICTING() is very close (identical?) to MySQL's use of "ON
>> DUPLICATE KEY UPDATE val = VALUES(val)". I'm happy to discuss that,
>> but it's news to me that people take particular issue with it.
>
> 3 people have asked you questions or commented about the use of
> CONFLICTING() while I've been watching.

Lots of people have asked me lots of questions. Again, as I said, I
wasn't aware that CONFLICTING() was a particular point of contention.
Please be more specific.

> It's clearly a non-standard
> mechanism and not inline with other Postgres usage.

What would be "inline with other Postgres usage"? I don't think you've
been clear on what you think is a better alternative.

I felt a function-like expression was appropriate because the user
refers to different tuples of the target table. It isn't like a join.
Plus it's similar to the MySQL thing, but doesn't misuse VALUES() as a
function-like thing.

> If there is disagreement, publishing the summary of changes you plan
> to make in your next version will help highlight that.

I think I've done a pretty good job of collecting and collating the
opinions of others, fwiw.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 9 October 2014 07:27, Peter Geoghegan <pg@heroku.com> wrote:

> Please be more specific.

Do not use CONFLICTING() which looks like it is a function.

Instead, use a row qualifier, such as NEW, OLD etc to reference values
from the incoming data
e.g. CONFLICTING.value rather than CONFLICTING(value)

Do not use the word CONFLICTING since it isn't clear whether you are
referring to the row in the table or the value in the incoming data. I
suggest the use of two separately named row qualifiers to allow us to
use either of those when desired. I don't have suggestions as to what
you should call those qualifiers, though Postgres already uses NEW and
OLD in similar circumstances in triggers. (This has nothing at all to
do with the MERGE command in the SQL standard, so please don't mention
that here.)

You may also wish to support the AS keyword, as MERGE does to make the
above even more clear.

e.g. SET col = EXISTING.col + NEW.col

Thank you.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 9, 2014 at 12:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Do not use CONFLICTING() which looks like it is a function.

So is ROW(). Or COALESCE().

> Instead, use a row qualifier, such as NEW, OLD etc to reference values
> from the incoming data
> e.g. CONFLICTING.value rather than CONFLICTING(value)
>
> Do not use the word CONFLICTING since it isn't clear whether you are
> referring to the row in the table or the value in the incoming data.

If you don't have a word that you think would more clearly indicate
the intent of the expression, I'm happy to hear suggestions from
others.

> You may also wish to support the AS keyword, as MERGE does to make the
> above even more clear.
>
> e.g. SET col = EXISTING.col + NEW.col

That's less clear, IMV. EXISTING.col is col - the very same Var. So
why qualify that it's the existing value in one place but not the
other? In fact, you can't do that now with updates in general:

postgres=# update upsert u set u.val = 'foo';
ERROR:  42703: column "u" of relation "upsert" does not exist
LINE 1: update upsert u set u.val = 'foo';                           ^
LOCATION:  transformUpdateStmt, analyze.c:2068

This does work, which is kind of what you outline:

postgres=# update upsert u set val = u.val;
UPDATE 3

But MERGE accepts the former in other systems (in general, and for
MERGE), where Postgres won't (for UPDATEs in general). Parse analysis
of UPDATE targetlists just rejects this outright.

FWIW, is any of the two tuples reference here "NEW", in any sense?
Informally, I'd say the new value is the resulting row - the final row
value after the UPDATE. We want to refer to the existing row, and the
row proposed for insertion (with all before trigger effects carried
forward).

Having the column reference go through an alias like this might be tricky.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Marti Raudsepp
Date:
On Thu, Oct 9, 2014 at 11:11 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Oct 9, 2014 at 12:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Do not use CONFLICTING() which looks like it is a function.
>
> So is ROW(). Or COALESCE().

ROW and COALESCE behave almost like functions: they operate on any
expression or value you pass to them.

db=# select coalesce('bar');coalesce
----------bar

Not so with CONFLICTING(), it only accepts a column name -- not a
value -- and has knowledge of the surrounding statement that ordinary
function-like constructs don't.

db=# INSERT into evt_type (name) values ('foo') on conflict UPDATE set
name=conflicting('bar');
ERROR:  syntax error at or near "'bar'"
LINE 1: ...lues ('foo') on conflict UPDATE set name=conflicting('bar');

> If you don't have a word that you think would more clearly indicate
> the intent of the expression, I'm happy to hear suggestions from
> others.

I also like NEW due to similarity with triggers, but I see your
concern about it not actually being "new".

Regards,
Marti



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 9 October 2014 09:11, Peter Geoghegan <pg@heroku.com> wrote:

>> You may also wish to support the AS keyword, as MERGE does to make the
>> above even more clear.
>>
>> e.g. SET col = EXISTING.col + NEW.col
>
> That's less clear, IMV. EXISTING.col is col - the very same Var. So
> why qualify that it's the existing value in one place but not the
> other? In fact, you can't do that now with updates in general:
>
> postgres=# update upsert u set u.val = 'foo';
> ERROR:  42703: column "u" of relation "upsert" does not exist
> LINE 1: update upsert u set u.val = 'foo';
>                             ^
> LOCATION:  transformUpdateStmt, analyze.c:2068

YES, which is exactly why I did not say this, I said something different.

> This does work, which is kind of what you outline:
>
> postgres=# update upsert u set val = u.val;
> UPDATE 3

YES, which is why I said it.

> But MERGE accepts the former in other systems (in general, and for
> MERGE), where Postgres won't (for UPDATEs in general). Parse analysis
> of UPDATE targetlists just rejects this outright.
>
> FWIW, is any of the two tuples reference here "NEW", in any sense?
> Informally, I'd say the new value is the resulting row - the final row
> value after the UPDATE. We want to refer to the existing row, and the
> row proposed for insertion (with all before trigger effects carried
> forward).

YES, which is why I specifically requested the ability to reference
"the incoming data".

Common sense interpretations make for quicker and easier discussions.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 9, 2014 at 1:33 AM, Marti Raudsepp <marti@juffo.org> wrote:
> ROW and COALESCE behave almost like functions: they operate on any
> expression or value you pass to them.

Okay, then like CONFLICTING() is like many of the XML expressions.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 9, 2014 at 1:41 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> YES, which is why I specifically requested the ability to reference
> "the incoming data".

My point is that people are not really inclined to use an alias in
UPDATEs in general when referring to the target. The thing that seems
special (and worthy of special qualification) is the reference to what
you call the "incoming data", and what I've called "tuples proposed
for insertion" (after being affected by any before row triggers).

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 9, 2014 at 1:56 AM, Peter Geoghegan <pg@heroku.com> wrote:
> My point is that people are not really inclined to use an alias in
> UPDATEs in general when referring to the target. The thing that seems
> special (and worthy of special qualification) is the reference to what
> you call the "incoming data", and what I've called "tuples proposed
> for insertion" (after being affected by any before row triggers).

For simple cases, you might not even bother with CONFLICTING() - you
might find it easier to just repeat the constant in the INSERT and
UPDATE parts of the query.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
V 1.3 is attached. Like Simon, I think that it's premature to commit
to one particular value locking implementation. Personally, I think
that approach #2 is what we'll end up using, but it makes no sense to
not maintain both at once, since it requires relatively little effort
to do so. At the very least it's a useful tool for reviewers, who
would otherwise be denied the opportunity to test whether any given
concurrency problem was attributable to the value locking
implementation, or something else. So there are two variants of V 1.3
attached - one uses an unchanged value locking implementation #1,
while the other an unchanged implementation #2.

Highlights
========

* No more "WITHIN unique_index_name". There is a new syntax that
supersedes it. Unique index inference based on columns (or
expressions) is *mandatory* for the ON CONFLICT UPDATE variant. It
remains optional for the IGNORE variant, because I think someone could
very reasonably not care which unique index was implicated. As
discussed, this implies that partial unique indexes are no longer
supported (but expression indexes work just fine). This may be
revisited, but I suggest doing so in a later release. Example of
merging on the unique index on column "name":

postgres=# explain insert into capitals(name, population, altitude)
values ('Riga', 1000, 5) on conflict (name) update set altitude =
excluded(altitude);
                              QUERY PLAN
----------------------------------------------------------------------
 Insert on capitals  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on capitals  (cost=0.00..1.01 rows=1 width=52)
(3 rows)

When we cannot infer a unique index, it looks like this:

postgres=# explain insert into capitals(name, population, altitude)
values ('Riga', 1000, 5) on conflict (population) update set altitude
= excluded(altitude);
ERROR:  42P10: could not infer which unique index to use from
expressions/columns provided for ON CONFLICT
LINE 1: ...e, population, altitude) values ('Riga', 1000, 5) on conflic...
                                                             ^
HINT:  Partial unique indexes are not supported.
LOCATION:  transformConflictClause, parse_clause.c:2407


* No more planner kludges. Index paths are never created for the
auxiliary UPDATE plan to begin with. To be tidy, TID scan paths are
never added either. Unless the UPDATE will never proceed due to a
tautological predicate like "WHERE false", we're guaranteed to have a
"sequential scan" in a fairly principled way (as before, we just use
this with EvalPlanQual(); it's just an implementation detail). I am
not entirely qualified to say so, but I think that this makes my
modifications to the optimizer look quite reasonable. The kludge of
enforcing various restrictions (e.g. on subqueries appearing in the
UPDATE) in the optimizer is also removed. We prefer to enforce
everything during parse analysis. This also gets us better error
messages, which is nice.

* Sane (although limited) support for table inheritance and updatable
views. However, in general user-defined rules are unsupported - I
cannot see how that could be made sane. The IGNORE variant works for
updatable views, and for inheritance relations with children (provided
that there is no inference required, which effectively makes the
UPDATE variant unsupported). However, both IGNORE and UPDATE variants
work for relations that happen to be in an inheritance hierarchy,
provided they have no children. I think it's fine to only support the
IGNORE variant for relations with inheritance children, because even
when users have the "partitioning pattern" use-case, there is no
principled way of telling the difference between a vanilla INSERT, and
an INSERT with an ON CONFLICT UPDATE clause from within the custom
redirection trigger function. Also, unique indexes already only work
at the relation level with inheritance - that's a long-standing
limitation. And so, in general INSERTs better have the right
inheritance child as their target - this is no more and no less true
when there is an ON CONFLICT UPDATE clause (note that I'm talking
about the "object orientated" inheritance use-case here, and not the
"partitioning pattern" use-case - with the latter, it's all up to the
trigger function to do the right thing). There may currently be an
"UPDATE ONLY", but there is no "INSERT ONLY" - why should I add one?

* Both INSERT and UPDATE sets of statement level triggers fire for
INSERT with ON CONFLICT UPDATE. The number of rows affected doesn't
matter, nor does it matter how they were or were not affected. This
certainly seems like the correct behavior. Per-row triggers work the
same as before, since far more thought went into their behavior
earlier. This incorporates feedback from Kevin.

* CONFLICTING() is renamed to EXCLUDED(). I know that some people
wanted me to go a different way with this. I think that there are very
good practical reasons not to [1], as well as good reasons related to
design, but I still accept that CONFLICTING() isn't very descriptive.
This spelling seems a lot better.

Clean-up
=======

If you take a look at the EXPLAIN output above, you'll see that the
sequential scan node does not appear. Basically, I'm back to
suppressing the implementation detail of that never-executed
"sequential scan", but the new approach is far better than earlier
approaches. Certain things actually associated with the sequential
scan are now attributed to the parent (auxiliary) ModifyTable UPDATE
node. Example (incidentally, note that "key" is used to infer a unique
index to take as an arbiter index):

postgres=# explain insert into upsert values (1000, 'Plucky') on
conflict (key) update set val = excluded(val) where key = 5;
                             QUERY PLAN
---------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on upsert  (cost=0.00..25.38 rows=6 width=36)
         Filter: (key = 5)
(4 rows)

This seems reasonable to me. Including the "(never executed)"
sequential scan would be very confusing to users.

Inference
=======

Unique index inference (i.e. the way we figure out  *which* unique
index to use) occurs during parse analysis. I think it would be
inappropriate, and certainly inconvenient to do it during planning. I
maintain that ON CONFLICT DML statements have a legitimate semantic
dependence on particular unique indexes, which makes this appropriate.
I don't know about others, but my only problem with naming unique
indexes directly was that it is unmaintainable, and very ugly. In
principle, I think this semantic dependence is quite reasonable, and
reflects the reality of how we all expect this to work. There are
comments on the implications for plan caching and how that relates to
where and when we perform unique index inference.

Documentation
===========

The documentation has been updated, incorporating feedback. I also
made the cardinality violation error a lot clearer than before, since
Craig said that was unclear.

Tests
====

Many tests were added. I've added a couple of new isolation tests.
insert-conflict-update-3 should be of particular interest. That test
illustrates the visibility issues with the WHERE clause that I've
already highlighted as a possible concern [2]. Unique index inference
tests will also give you a fair idea of how flexible it is.

Remaining open items
=================

Apart from the obvious issue of value locking (i.e. verifying the
correctness of its implementation in general), the only open items
are:

* RLS needs to be considered. I have yet to give it any real thought.

* I could probably do better at postgres_fdw support. That seems like
something that could be followed up on later, because it's clearly
just about a Simple Matter of Programming.

In summary, I was able to remove a lot of TODO/FIXME items here -
almost all of them. I'm pretty happy about that. I'll have to edit the
UPSERT wiki page, to strike out many open items...

[1] http://www.postgresql.org/message-id/CAM3SWZQhiXQi1osT14V7spjQrUpmcnRtbXJe846-EB1bC+9i1g@mail.gmail.com
[2]
https://wiki.postgresql.org/wiki/UPSERT#Visibility_issues_and_the_proposed_syntax_.28WHERE_clause.2Fpredicate_stuff.29
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Oct 23, 2014 at 6:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Documentation
> ===========
>
> The documentation has been updated, incorporating feedback. I also
> made the cardinality violation error a lot clearer than before, since
> Craig said that was unclear.


For the convenience of those that want to read the documentation
quickly to figure out one particular aspect of user-visible behavior,
the latest revision is uploaded here:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs

Of particular interest:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/transaction-iso.html#XACT-READ-COMMITTED
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createview.html#SQL-CREATEVIEW-UPDATABLE-VIEWS
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/ddl-inherit.html

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
> * CONFLICTING() is renamed to EXCLUDED(). I know that some people
> wanted me to go a different way with this. I think that there are very
> good practical reasons not to [1], as well as good reasons related to
> design, but I still accept that CONFLICTING() isn't very descriptive.
> This spelling seems a lot better.

Specifically, "some people" included at least three committers and at
least one other community member no less prominent than yourself, or
in other words, every single person who bothered to comment.  You can
think whatever you like; the chances of it being committed that way
are zero.

> Unique index inference (i.e. the way we figure out  *which* unique
> index to use) occurs during parse analysis. I think it would be
> inappropriate, and certainly inconvenient to do it during planning.

You're wrong.  The choice of which index to use is clearly wildly
inappropriately placed in the parse analysis phase, and if you think
that has any chance of ever being committed by anyone, then you are
presuming the existence of a committer who won't mind ignoring the
almost-immediate revert request that would draw from, at the very
least, Tom.

As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
syntax proposed upthread was the best of any mentioned thus far.  The
MERGE-based syntaxes proposed upthread were crazily verbose for no
discernable benefit.

As much as I'd like to have this feature, your refusal to change
anything except when asked at least three times each by about five
different people makes this effort barely worth pursuing.  You can say
all you like that you're receptive to feedback, but multiple people
here are telling you that they feel otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 10/24/2014 10:04 AM, Robert Haas wrote:
> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
> syntax proposed upthread was the best of any mentioned thus far.  The
> MERGE-based syntaxes proposed upthread were crazily verbose for no
> discernable benefit.

For those of us who haven't followed every post in this thread, is there
somewhere I can see the proposed syntax?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Oct 24, 2014 at 1:18 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 10/24/2014 10:04 AM, Robert Haas wrote:
>> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
>> syntax proposed upthread was the best of any mentioned thus far.  The
>> MERGE-based syntaxes proposed upthread were crazily verbose for no
>> discernable benefit.
>
> For those of us who haven't followed every post in this thread, is there
> somewhere I can see the proposed syntax?

There are a couple of different proposals but this should give you a
feeling for where we're at:

http://www.postgresql.org/message-id/CA+TgmoZN=2AJKi1n4Jz5BkmYi8r_CPUDW+DtoppmTeLVmsOoqw@mail.gmail.com

The part I like is the idea of making UPSERT look like an INSERT
statement with an additional clause that says how a unique violation
should be handled.  The part nobody except Peter likes is using
functional notation like CONFLICTING() or EXCLUDED() to pull in values
from the tuple causing the unique violation.  And there are some other
areas of disagreement about particular keywords and so on.  But I
personally like that general style much more than the alternatives
derived from MERGE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Oct 24, 2014 at 10:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Oct 23, 2014 at 9:43 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> * CONFLICTING() is renamed to EXCLUDED(). I know that some people
>> wanted me to go a different way with this. I think that there are very
>> good practical reasons not to [1], as well as good reasons related to
>> design, but I still accept that CONFLICTING() isn't very descriptive.
>> This spelling seems a lot better.
>
> Specifically, "some people" included at least three committers and at
> least one other community member no less prominent than yourself, or
> in other words, every single person who bothered to comment.  You can
> think whatever you like; the chances of it being committed that way
> are zero.

This is the situation with unique index inference all over again. You
don't like a function-like expression. Okay. It would be a lot more
helpful if instead of just criticizing, you *also* looked at my
reasons for not wanting to go with something that would necessitate
adding a dummy range table entry [1]. There are some very good
practical reasons for avoiding that. We only do that (the OLD.* and
NEW.* stuff) in the context of CREATE RULE and one or two other
things. We don't play any games like that during parse analysis of
ordinary optimizable statements, which is what this is. And those
dummy RTEs are always placeholders, AFAICT. Apart from those technical
reasons, the two situations just aren't comparable from a user-visible
perspective, but that isn't the main problem.

I don't particularly expect you to come around to my view here. But it
would be nice to have the problems with dummy RTEs and so on
acknowledged, so it's understood that that is in itself a strange new
direction for parse analysis of ordinary optimizable statements.

>> Unique index inference (i.e. the way we figure out  *which* unique
>> index to use) occurs during parse analysis. I think it would be
>> inappropriate, and certainly inconvenient to do it during planning.
>
> You're wrong.  The choice of which index to use is clearly wildly
> inappropriately placed in the parse analysis phase, and if you think
> that has any chance of ever being committed by anyone, then you are
> presuming the existence of a committer who won't mind ignoring the
> almost-immediate revert request that would draw from, at the very
> least, Tom.

Why? This has nothing to do with optimization. You either have an
appropriate unique index available, in which case you can proceed (and
the cost associated with value locking the index and so on is an
inherent cost, kind of like inserting into an index in general). Or
you don't, and you can't (you get an error, not a sequential scan).
That's what I'd call a semantic dependence on the index. Yes, that's
kind of unusual, but it's the reality.

I can have the code do unique index inference during planning instead
if you insist, but I think that that isn't useful. If you want me to
do things that way, please explain why. Otherwise, even if I simply
acquiesce to your wishes, I may end up missing the point.

> As far as syntax goes, I thought the INSERT .. ON CONFLICT UPDATE
> syntax proposed upthread was the best of any mentioned thus far.  The
> MERGE-based syntaxes proposed upthread were crazily verbose for no
> discernable benefit.

I don't think anyone is too insistent on pursuing that. I think that
the INSERT .. ON CONFLICT UPDATE syntax has more or less emerged as
the favorite.

> As much as I'd like to have this feature, your refusal to change
> anything except when asked at least three times each by about five
> different people makes this effort barely worth pursuing.  You can say
> all you like that you're receptive to feedback, but multiple people
> here are telling you that they feel otherwise.

I'm tired of your assertions about why I haven't done certain things.
It's not fair. I have incorporated a lot of feedback into V1.3 (and
previous versions), which isn't acknowledged. Also, I moved to the USA
this week, and things have been hectic. As I said in relation to
unique index inference, please consider that I haven't immediately
complied with that feedback because there are genuine technical
obstacles that at the very least make it more difficult than it
appears. And, I'm not necessarily able to comply quickly because of
time constraints.

[1] http://www.postgresql.org/message-id/CAM3SWZQhiXQi1osT14V7spjQrUpmcnRtbXJe846-EB1bC+9i1g@mail.gmail.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Oct 24, 2014 at 3:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> This is the situation with unique index inference all over again. You
> don't like a function-like expression. Okay. It would be a lot more
> helpful if instead of just criticizing, you *also* looked at my
> reasons for not wanting to go with something that would necessitate
> adding a dummy range table entry [1].

The problem here isn't that I haven't read your emails.  I have read
them all, including that one.  Moreover, this isn't the first time
you've asserted that someone hasn't read one of your emails.  So if
we're going to say what we each think would be helpful, then I think
it would be helpful if you stopped accusing the people who are taking
time to provide feedback on your work of having failed to read your
emails.  It's possible that there may be instances where that problem
exists, but it beggars credulity to suppose that the repeated
unanimous consensus against some of your design decisions is entirely
an artifact of failure to pay attention.

The fact is, I don't feel obliged to respond to every one of your
emails, just as you don't respond to every one of mine.  If you want
this patch to ever get committed, it's your job to push it forward -
not mine, not Simon's, and not Heikki's.  Sometimes that means you
have to solve hard problems instead of just articulating what they
are.

> There are some very good
> practical reasons for avoiding that. We only do that (the OLD.* and
> NEW.* stuff) in the context of CREATE RULE and one or two other
> things. We don't play any games like that during parse analysis of
> ordinary optimizable statements, which is what this is. And those
> dummy RTEs are always placeholders, AFAICT. Apart from those technical
> reasons, the two situations just aren't comparable from a user-visible
> perspective, but that isn't the main problem.
>
> I don't particularly expect you to come around to my view here. But it
> would be nice to have the problems with dummy RTEs and so on
> acknowledged, so it's understood that that is in itself a strange new
> direction for parse analysis of ordinary optimizable statements.

You're conflating the user-visible syntax with the parse tree
representation in way that is utterly without foundation.  I don't
have a position at this point on which parse-analysis representation
is preferable, but it's completely wrong-headed to say that you
"can't" make NEW.x produce the same parse-analysis output that your
CONFLICTING(x) syntax would have created.  Sure, it might be harder,
but it's not that much harder, and it's definitely not an unsolvable
problem.

I do acknowledge that there might be a better syntax out there than
NEW.x and OLD.x.  I have not seen one proposed that I like better.
Feel free to propose something.  But don't keep re-proposing something
that LOOKSLIKE(this) because nobody - other than perhaps you - likes
that.  And don't use the difficulty of parse analysis as a
justification for your proposed syntax, because, except in extreme
cases, there are going to be very few if any regular contributors to
this mailing list who will accept that as a justification for one
syntax over another.  Syntax needs to be justified by being beautiful,
elegant, precedent-driven, orthogonal, and minimalist - not by whether
you might need an extra 25-75 lines of parse analysis code to make it
work.

>>> Unique index inference (i.e. the way we figure out  *which* unique
>>> index to use) occurs during parse analysis. I think it would be
>>> inappropriate, and certainly inconvenient to do it during planning.
>>
>> You're wrong.  The choice of which index to use is clearly wildly
>> inappropriately placed in the parse analysis phase, and if you think
>> that has any chance of ever being committed by anyone, then you are
>> presuming the existence of a committer who won't mind ignoring the
>> almost-immediate revert request that would draw from, at the very
>> least, Tom.
>
> Why? This has nothing to do with optimization.

That is false.  If there is more than one index that could be used,
the system should select the best one.  That is an optimization
decision per se.  Also, if a plan is saved - in the plancache, say, or
in a view - the query can be re-planned if the index it depends on is
dropped, but there's no way to do parse analysis.

>> As much as I'd like to have this feature, your refusal to change
>> anything except when asked at least three times each by about five
>> different people makes this effort barely worth pursuing.  You can say
>> all you like that you're receptive to feedback, but multiple people
>> here are telling you that they feel otherwise.
>
> I'm tired of your assertions about why I haven't done certain things.
> It's not fair. I have incorporated a lot of feedback into V1.3 (and
> previous versions), which isn't acknowledged. Also, I moved to the USA
> this week, and things have been hectic. As I said in relation to
> unique index inference, please consider that I haven't immediately
> complied with that feedback because there are genuine technical
> obstacles that at the very least make it more difficult than it
> appears. And, I'm not necessarily able to comply quickly because of
> time constraints.

Well, I'm equally tired of being asked to review patches that respond
to only a small percentage of the feedback already given.  I, too,
sometimes lack the time to incorporate the feedback of others into my
patches.  When that happens, I don't re-post them until I do have the
time.  I've even been known to drop patches altogether rather than
continue arguing about them, as you will of course recall.  There's no
shame in taking longer to get something done, but asking other people
to spend time on it when you haven't had time yourself can lead to
frustrations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Oct 24, 2014 at 1:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> The problem here isn't that I haven't read your emails.  I have read
> them all, including that one.  Moreover, this isn't the first time
> you've asserted that someone hasn't read one of your emails.  So if
> we're going to say what we each think would be helpful, then I think
> it would be helpful if you stopped accusing the people who are taking
> time to provide feedback on your work of having failed to read your
> emails.  It's possible that there may be instances where that problem
> exists, but it beggars credulity to suppose that the repeated
> unanimous consensus against some of your design decisions is entirely
> an artifact of failure to pay attention.

Okay.

> The fact is, I don't feel obliged to respond to every one of your
> emails, just as you don't respond to every one of mine.  If you want
> this patch to ever get committed, it's your job to push it forward -
> not mine, not Simon's, and not Heikki's.  Sometimes that means you
> have to solve hard problems instead of just articulating what they
> are.

Okay.

> You're conflating the user-visible syntax with the parse tree
> representation in way that is utterly without foundation.  I don't
> have a position at this point on which parse-analysis representation
> is preferable, but it's completely wrong-headed to say that you
> "can't" make NEW.x produce the same parse-analysis output that your
> CONFLICTING(x) syntax would have created.  Sure, it might be harder,
> but it's not that much harder, and it's definitely not an unsolvable
> problem.

I don't believe I did. The broader point is that the difficulty in
making that work reflects the conceptually messiness, from
user-visible aspects down. I can work with the difficulty, and I may
even be able to live with the messiness, but I'm trying to bring the
problems with it to a head sooner rather than later for good practical
reasons. In all sincerity, my real concern is that you or the others
will change your mind when I actually go and implement an OLD.* style
syntax, and see the gory details. I regret it if to ask this is to ask
too much of you, but FYI that's the thought process behind it.

> I do acknowledge that there might be a better syntax out there than
> NEW.x and OLD.x.  I have not seen one proposed that I like better.
> Feel free to propose something.  But don't keep re-proposing something
> that LOOKSLIKE(this) because nobody - other than perhaps you - likes
> that.

Okay.

So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
that's an interesting precedent. During rewriting, this gets rewritten
such that you end up with something that looks to the planner as if
the original query included a constant (this actually comes from a
catalog look-up for the column during rewriting). What if we spelled
EXCLUDING/CONFLICTING as follows:

INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
EXCLUDED || 'this works' WHERE another_col != EXCLUDED;

Then rewriting would figure these details out. From a design
perspective, there'd need to be a few details worked out about how
inference actually works - inferring *which* column the EXCLUDED
expression actually referred to, but it seems doable, especially given
the existing restrictions on the structure of the UPDATE. We're not
rewriting from a SetToDefault to a constant, but a SetToDefault-like
thing to a special Var (actually, the finished representation probably
makes it to the execution stage with that Var representation filled
in, unlike SetToDefault, but it's basically the same pattern). It
solves my problem with dummy range table entries. Actually, *any* new
kind of expression accomplishes this just as well. My concern here is
more around not needing cute tricks with dummy RTEs than it is around
being in favor of any particular expression-based syntax.

What do you think of that?

> And don't use the difficulty of parse analysis as a
> justification for your proposed syntax, because, except in extreme
> cases, there are going to be very few if any regular contributors to
> this mailing list who will accept that as a justification for one
> syntax over another.  Syntax needs to be justified by being beautiful,
> elegant, precedent-driven, orthogonal, and minimalist - not by whether
> you might need an extra 25-75 lines of parse analysis code to make it
> work.

Well, that isn't the case here. I'd much rather code my way out of a
disagreement with you.

>>>> Unique index inference (i.e. the way we figure out  *which* unique
>>>> index to use) occurs during parse analysis. I think it would be
>>>> inappropriate, and certainly inconvenient to do it during planning.
>>>
>>> You're wrong.  The choice of which index to use is clearly wildly
>>> inappropriately placed in the parse analysis phase, and if you think
>>> that has any chance of ever being committed by anyone, then you are
>>> presuming the existence of a committer who won't mind ignoring the
>>> almost-immediate revert request that would draw from, at the very
>>> least, Tom.
>>
>> Why? This has nothing to do with optimization.
>
> That is false.  If there is more than one index that could be used,
> the system should select the best one.  That is an optimization
> decision per se.  Also, if a plan is saved - in the plancache, say, or
> in a view - the query can be re-planned if the index it depends on is
> dropped, but there's no way to do parse analysis.

Generating index paths for the UPDATE is a waste of cycles.
Theoretically, there could be an (a, b, c) unique index and a (c,b,a)
unique index, and those two might have a non-equal cost to scan. But
that almost certainly isn't going to happen in practice, since that's
a rather questionable indexing strategy, and even when it does, you're
going to have to insert into all the unique indexes a good proportion
of the time anyway, making the benefits of that approach pale in
comparison to the costs. And that's just the cost in CPU cycles, and
not code complexity.

I don't know why you want to bring a cost model, or choice of indexes
into this. It simply isn't comparable to how the system comes up with
which index to use in all other contexts. Now, I could easily make all
this happen during planning, just to not have to argue with you, but I
think that doing so is less similar to how things already work, not
more similar. It certainly doesn't imply more code reuse, since
get_relation_info() is clearly quite unsuitable.

> Well, I'm equally tired of being asked to review patches that respond
> to only a small percentage of the feedback already given.  I, too,
> sometimes lack the time to incorporate the feedback of others into my
> patches.  When that happens, I don't re-post them until I do have the
> time.  I've even been known to drop patches altogether rather than
> continue arguing about them, as you will of course recall.  There's no
> shame in taking longer to get something done, but asking other people
> to spend time on it when you haven't had time yourself can lead to
> frustrations.

From my point of view, I spent a significant amount of time making the
patch more or less match your proposed design for unique index
inference. It is discouraging to hear that you think I'm not
cooperating with community process. I'm doing my best. I think it
would be a bad idea for me to not engage with the community for an
extended period at this point. There were plenty of other issues
address by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
you highlighted (or the other thing you highlighted around where to do
unique index inference).
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Petr Jelinek
Date:
On 25/10/14 00:48, Peter Geoghegan wrote:
>> You're conflating the user-visible syntax with the parse tree
>> representation in way that is utterly without foundation.  I don't
>> have a position at this point on which parse-analysis representation
>> is preferable, but it's completely wrong-headed to say that you
>> "can't" make NEW.x produce the same parse-analysis output that your
>> CONFLICTING(x) syntax would have created.  Sure, it might be harder,
>> but it's not that much harder, and it's definitely not an unsolvable
>> problem.
>
> I don't believe I did. The broader point is that the difficulty in
> making that work reflects the conceptually messiness, from
> user-visible aspects down. I can work with the difficulty, and I may
> even be able to live with the messiness, but I'm trying to bring the
> problems with it to a head sooner rather than later for good practical
> reasons. In all sincerity, my real concern is that you or the others
> will change your mind when I actually go and implement an OLD.* style
> syntax, and see the gory details. I regret it if to ask this is to ask
> too much of you, but FYI that's the thought process behind it.
>

If you feel so strongly that it's wrong even though everybody else seems 
to prefer it and if you at the same time feel so strongly about people 
changing minds once you implement this, maybe the best way to convince 
us is to show us the implementation (at this point it would probably 
have taken less of your time than the argument did).

>
> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
> that's an interesting precedent. During rewriting, this gets rewritten
> such that you end up with something that looks to the planner as if
> the original query included a constant (this actually comes from a
> catalog look-up for the column during rewriting). What if we spelled
> EXCLUDING/CONFLICTING as follows:
>
> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>
> Then rewriting would figure these details out. From a design
> perspective, there'd need to be a few details worked out about how
> inference actually works - inferring *which* column the EXCLUDED
> expression actually referred to, but it seems doable, especially given
> the existing restrictions on the structure of the UPDATE. We're not
> rewriting from a SetToDefault to a constant, but a SetToDefault-like
> thing to a special Var (actually, the finished representation probably
> makes it to the execution stage with that Var representation filled
> in, unlike SetToDefault, but it's basically the same pattern). It
> solves my problem with dummy range table entries. Actually, *any* new
> kind of expression accomplishes this just as well. My concern here is
> more around not needing cute tricks with dummy RTEs than it is around
> being in favor of any particular expression-based syntax.
>
> What do you think of that?
>

Ugh, you want to auto-magically detect what value is behind the EXCLUDED 
based on how/where it's used in the UPDATE? That seems like quite a bad 
idea.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> If you feel so strongly that it's wrong even though everybody else seems to
> prefer it and if you at the same time feel so strongly about people changing
> minds once you implement this, maybe the best way to convince us is to show
> us the implementation (at this point it would probably have taken less of
> your time than the argument did).

No, it wouldn't have - I don't think anyone believes that. Magic
addRangeTableEntryForRelation() calls are only used in the context of
one or two utility statements that have pretty limited scope. Support
for an OLD.* style syntax would have to exist at *all* stages of query
execution, from parse analysis through to rewriting, planning, and
execution. That's the difference here - this isn't a utility command.

>> So in an UPDATE targetlist, you can assign DEFAULT to a column. Maybe
>> that's an interesting precedent. During rewriting, this gets rewritten
>> such that you end up with something that looks to the planner as if
>> the original query included a constant (this actually comes from a
>> catalog look-up for the column during rewriting). What if we spelled
>> EXCLUDING/CONFLICTING as follows:
>>
>> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
>> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>>
>> Then rewriting would figure these details out. From a design
>> perspective, there'd need to be a few details worked out about how
>> inference actually works - inferring *which* column the EXCLUDED
>> expression actually referred to, but it seems doable, especially given
>> the existing restrictions on the structure of the UPDATE. We're not
>> rewriting from a SetToDefault to a constant, but a SetToDefault-like
>> thing to a special Var (actually, the finished representation probably
>> makes it to the execution stage with that Var representation filled
>> in, unlike SetToDefault, but it's basically the same pattern). It
>> solves my problem with dummy range table entries. Actually, *any* new
>> kind of expression accomplishes this just as well. My concern here is
>> more around not needing cute tricks with dummy RTEs than it is around
>> being in favor of any particular expression-based syntax.
>>
>> What do you think of that?
>>
>
> Ugh, you want to auto-magically detect what value is behind the EXCLUDED
> based on how/where it's used in the UPDATE? That seems like quite a bad
> idea.

That's *exactly* how DEFAULT works within UPDATE targetlists. There
might be a few more details to work out here, but not terribly many,
and that's going to be true no matter what. 95%+ of the time, it'll
just be "val = EXCLUDED" anyway.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Alvaro Herrera
Date:
Peter Geoghegan wrote:
> On Fri, Oct 24, 2014 at 4:39 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

> > Ugh, you want to auto-magically detect what value is behind the EXCLUDED
> > based on how/where it's used in the UPDATE? That seems like quite a bad
> > idea.
> 
> That's *exactly* how DEFAULT works within UPDATE targetlists. There
> might be a few more details to work out here, but not terribly many,
> and that's going to be true no matter what. 95%+ of the time, it'll
> just be "val = EXCLUDED" anyway.

Petr's thought mirrors mine.  What are you going to do the other 5% of
the time?  Is there some other way to refer to the columns of the
"excluded" row?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Oct 24, 2014 at 6:48 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> You're conflating the user-visible syntax with the parse tree
>> representation in way that is utterly without foundation.  I don't
>> have a position at this point on which parse-analysis representation
>> is preferable, but it's completely wrong-headed to say that you
>> "can't" make NEW.x produce the same parse-analysis output that your
>> CONFLICTING(x) syntax would have created.  Sure, it might be harder,
>> but it's not that much harder, and it's definitely not an unsolvable
>> problem.
>
> I don't believe I did. The broader point is that the difficulty in
> making that work reflects the conceptually messiness, from
> user-visible aspects down. I can work with the difficulty, and I may
> even be able to live with the messiness, but I'm trying to bring the
> problems with it to a head sooner rather than later for good practical
> reasons. In all sincerity, my real concern is that you or the others
> will change your mind when I actually go and implement an OLD.* style
> syntax, and see the gory details. I regret it if to ask this is to ask
> too much of you, but FYI that's the thought process behind it.

I think what's more likely is that we'll complain about the way you
chose to implement it.  I don't believe your contention (in the other
email) that "Support
for an OLD.* style syntax would have to exist at *all* stages of query
execution, from parse analysis through to rewriting, planning, and
execution."  I think if your design for implementing that syntax
requires that, and your design for some other syntax doesn't require
that, then you're not thinking clearly enough about what needs to
happen in parse analysis.  Make the parse analysis for this syntax
emit the same representation that you would have had it emit in the
other syntax, and you won't have this problem.

> What if we spelled
> EXCLUDING/CONFLICTING as follows:
>
> INSERT INTO upsert VALUES(1, 'Art') ON CONFLICT (key) UPDATE SET val =
> EXCLUDED || 'this works' WHERE another_col != EXCLUDED;
>
> What do you think of that?

I am in complete agreement with the comments made by Petr and Alvaro.

>>>> You're wrong.  The choice of which index to use is clearly wildly
>>>> inappropriately placed in the parse analysis phase, and if you think
>>>> that has any chance of ever being committed by anyone, then you are
>>>> presuming the existence of a committer who won't mind ignoring the
>>>> almost-immediate revert request that would draw from, at the very
>>>> least, Tom.
>>>
>>> Why? This has nothing to do with optimization.
>>
>> That is false.  If there is more than one index that could be used,
>> the system should select the best one.  That is an optimization
>> decision per se.  Also, if a plan is saved - in the plancache, say, or
>> in a view - the query can be re-planned if the index it depends on is
>> dropped, but there's no way to do parse analysis.
>
> Generating index paths for the UPDATE is a waste of cycles.
> Theoretically, there could be an (a, b, c) unique index and a (c,b,a)
> unique index, and those two might have a non-equal cost to scan. But
> that almost certainly isn't going to happen in practice, since that's
> a rather questionable indexing strategy, and even when it does, you're
> going to have to insert into all the unique indexes a good proportion
> of the time anyway, making the benefits of that approach pale in
> comparison to the costs.

I don't care whether you actually generate index-paths or not, and in
fact I suspect it makes no sense to do so.  But I do care that you do
a cost comparison between the available indexes and pick the one that
looks cheapest.  If somebody's got a bloated index and builds a new
one, they will want it to get used even before they drop the old one.

Even if that weren't an issue, though, the fact that you can't
re-parse but you can re-plan is a killer point AFAICS. It means you
are borked if the statement gets re-executed after the index you
picked is dropped.

> From my point of view, I spent a significant amount of time making the
> patch more or less match your proposed design for unique index
> inference. It is discouraging to hear that you think I'm not
> cooperating with community process. I'm doing my best. I think it
> would be a bad idea for me to not engage with the community for an
> extended period at this point. There were plenty of other issues
> address by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
> you highlighted (or the other thing you highlighted around where to do
> unique index inference).

I think this gets at a point that has been bugging me and, perhaps,
other people here.  You often post a new patch with some fixes but
without fixes for the issues that reviewers have indicated are
top-of-mind for them.  Sometimes, but not always, you say something
like "I know this doesn't fix X but I'd like comments on other aspects
of the patch".  Even when you do, though, it can be difficult to
overlook something that appears to be a fundamental structural problem
to comment on details, and sometimes it feels like that's what we're
being asked to do.  When I'm reviewing, I tend to find issues more or
less in proportion to the time I spend on the patch.  If things that I
complained about before haven't been fixed, I tend to find the same
ones again, but not necessarily all that much faster than I found them
the first time.  So that's not efficient for me.  I would not make an
absolute rule that no patch should ever be re-posted without
addressing every issue; I wouldn't be able to follow that rule in
every case myself.  But I try to follow it as often as I can, and I
would suggest that you try to lean quite a bit more firmly in that
direction.  I think you will make more progress, and spend less time
arguing with others.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sat, Oct 25, 2014 at 8:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Generating index paths for the UPDATE is a waste of cycles.
>> Theoretically, there could be an (a, b, c) unique index and a (c,b,a)
>> unique index, and those two might have a non-equal cost to scan. But
>> that almost certainly isn't going to happen in practice, since that's
>> a rather questionable indexing strategy, and even when it does, you're
>> going to have to insert into all the unique indexes a good proportion
>> of the time anyway, making the benefits of that approach pale in
>> comparison to the costs.
>
> I don't care whether you actually generate index-paths or not, and in
> fact I suspect it makes no sense to do so.  But I do care that you do
> a cost comparison between the available indexes and pick the one that
> looks cheapest.  If somebody's got a bloated index and builds a new
> one, they will want it to get used even before they drop the old one.

That seems extremely narrow. I am about ready to just do what you say,
by costing the index based on something like relpages, but for the
record I see no point. If I see no point, and you're sure that there
is a point, then there is a danger that I'll miss the point, which
contributes to my giving push back. I know I said that already, but I
reiterate it once more for emphasis.

> Even if that weren't an issue, though, the fact that you can't
> re-parse but you can re-plan is a killer point AFAICS. It means you
> are borked if the statement gets re-executed after the index you
> picked is dropped.

That simply isn't the case. As specifically noted in comments in the
patch, relcache invalidation works in such a way as to invalidate the
query proper, even when only an index has been dropped. For a time
during development, the semantic dependence was more directly
represented by actually having extract_query_dependencies() extract
the arbitrating unique index pg_class Oid as a dependency for the
benefit of the plan cache, but I ultimately decided against doing
anything more than noting the potential future problem (the problem
that arises in a world where relcache invalidation is more selective).

But, I digress: I'll have inference occur during planning during the
next revision since you think that's important.

>> From my point of view, I spent a significant amount of time making the
>> patch more or less match your proposed design for unique index
>> inference. It is discouraging to hear that you think I'm not
>> cooperating with community process. I'm doing my best. I think it
>> would be a bad idea for me to not engage with the community for an
>> extended period at this point. There were plenty of other issues
>> address by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
>> you highlighted (or the other thing you highlighted around where to do
>> unique index inference).
>
> I think this gets at a point that has been bugging me and, perhaps,
> other people here.  You often post a new patch with some fixes but
> without fixes for the issues that reviewers have indicated are
> top-of-mind for them.  Sometimes, but not always, you say something
> like "I know this doesn't fix X but I'd like comments on other aspects
> of the patch".  Even when you do, though, it can be difficult to
> overlook something that appears to be a fundamental structural problem
> to comment on details, and sometimes it feels like that's what we're
> being asked to do.

I think that it's accurate to say that I asked the community to do
that once, last year. I was even very explicit about it at the time.
I'm surprised that you think that's the case now, though. I learned my
lesson a little later: don't do that, because fellow contributors are
unwilling to go along with it, even for something as important as
this.

As recently as a few months ago, you wanted me to go a *completely*
different direction with this (i.e. attempt an UPDATE first). I'm
surprised that you're emphasizing the EXCLUDED()/CONFLICTING() thing
so much. Should I take it that I more or less have your confidence
that the way the patch fits together at a high level is sound?
Because, that's the only way I can imagine you'd think that the
EXCLUDED()/CONFLICTING() thing is of such fundamental importance at
this point. It is after all split out as a much smaller commit on top
of the main implementation (a commit which could easily be deferred).
While it's important, it is very clearly subsidiary to the overall
structure of my design, which I think that there has been precious
little discussion of on this thread (e.g. the whole way I use
EvalPlanQual(), the general idea of a special auxiliary plan that is
executed in a special way, etc). That concerns me, because I suspect
that the reason there has been next to no discussion of those aspects
is because they're particularly hard things to have an opinion on, and
yet are the things of immediate importance. Please correct me if I
have it wrong.

I am in no position to demand that you or anyone else discuss any
particular aspect of the patch, of course. I am just conveying my
concerns. Like all of us, I very much want to get this feature into
the next release of PostgreSQL.

> When I'm reviewing, I tend to find issues more or
> less in proportion to the time I spend on the patch.  If things that I
> complained about before haven't been fixed, I tend to find the same
> ones again, but not necessarily all that much faster than I found them
> the first time.  So that's not efficient for me.  I would not make an
> absolute rule that no patch should ever be re-posted without
> addressing every issue; I wouldn't be able to follow that rule in
> every case myself.  But I try to follow it as often as I can, and I
> would suggest that you try to lean quite a bit more firmly in that
> direction.  I think you will make more progress, and spend less time
> arguing with others.

I'll try. But let me point out that the *very first* thing you
complained about, in relation to version 1.0, was that the plan
structure was funny. V1.3 cleaned that up, making a "sequential scan"
always occur, and having EXPLAIN treat that as a strict implementation
detail, while still attributing some aspects of the hidden, unexecuted
"sequential scan" (e.g. the qual) to its parent where that makes
sense. This seems like something that squarely addressed your
*original* concern. A little later, you (rightly) complained about the
lack of support for inheritance, and to a lesser extent updatable
views. As of V1.3, I have reasonable support for both of those things.
And, of course, you wanted a unique index to be inferred from a set of
columns/expressions, which (most notably) v1.3 now does. Also, I
incorporated feedback from Kevin on some of those same items, as well
as the behavior of statement-level triggers. Furthermore, I
incorporated the feedback of Simon in having way more tests, in
particular, two new isolation tests, one of which illustrates the
qualitatively new "MVCC violation" in more detail.

In short, I think I've done a pretty good job of incorporating
feedback, and where I haven't I have been quite clear about it (it
certainly didn't take you long to figure it out in this most recent
instance). There is always room for improvement, but in my book V1.3
made a lot more than small, incremental progress. I am surprised by
your remarks suggesting that the progress of each revision is
excessively gradual in addressing your concerns. AFAICT, it's just
that V1.3 wasn't totally comprehensive in doing so. In reality, the
main reason for that is: getting this feature to a point where it
might plausibly be committed is bloody difficult, as evidenced by the
fact that it took this long for someone to produce a patch. Please
don't lose sight of that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Sun, Oct 26, 2014 at 4:39 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I don't care whether you actually generate index-paths or not, and in
>> fact I suspect it makes no sense to do so.  But I do care that you do
>> a cost comparison between the available indexes and pick the one that
>> looks cheapest.  If somebody's got a bloated index and builds a new
>> one, they will want it to get used even before they drop the old one.
>
> That seems extremely narrow. I am about ready to just do what you say,
> by costing the index based on something like relpages, but for the
> record I see no point. If I see no point, and you're sure that there
> is a point, then there is a danger that I'll miss the point, which
> contributes to my giving push back. I know I said that already, but I
> reiterate it once more for emphasis.

I don't think it's either narrow or difficult: I think you just need
to apply the existing index costing function.  CLUSTER does a similar
calculation to figure out whether to seq-scan or sort, even though it
doesn't use index paths per se, or at least I don't think it does.

>> Even if that weren't an issue, though, the fact that you can't
>> re-parse but you can re-plan is a killer point AFAICS. It means you
>> are borked if the statement gets re-executed after the index you
>> picked is dropped.
>
> That simply isn't the case. As specifically noted in comments in the
> patch, relcache invalidation works in such a way as to invalidate the
> query proper, even when only an index has been dropped. For a time
> during development, the semantic dependence was more directly
> represented by actually having extract_query_dependencies() extract
> the arbitrating unique index pg_class Oid as a dependency for the
> benefit of the plan cache, but I ultimately decided against doing
> anything more than noting the potential future problem (the problem
> that arises in a world where relcache invalidation is more selective).

I don't know what you mean by "invalidate the query proper".  Here's
my chain of reasoning: If the index selection occurs in parse
analysis, then it's embedded in the parse tree.  If this can used a
writable CTE, then the parse tree can get stored someplace.  If the
index is then dropped, the parse tree is no good any more.  Where's
the flaw in that reasoning?

I haven't looked at the patch to know exactly what will happen in that
case, but it seems to me like it must either not work or there must be
some ugly hack to make it work.

> But, I digress: I'll have inference occur during planning during the
> next revision since you think that's important.

Great.

> I think that it's accurate to say that I asked the community to do
> that once, last year. I was even very explicit about it at the time.
> I'm surprised that you think that's the case now, though. I learned my
> lesson a little later: don't do that, because fellow contributors are
> unwilling to go along with it, even for something as important as
> this.

I'm just telling you how I feel.  I can't vouch for it being perfectly
fair or accurate, though I do strive for that.

> As recently as a few months ago, you wanted me to go a *completely*
> different direction with this (i.e. attempt an UPDATE first). I'm
> surprised that you're emphasizing the EXCLUDED()/CONFLICTING() thing
> so much. Should I take it that I more or less have your confidence
> that the way the patch fits together at a high level is sound?

No.  Commenting on one aspect of a patch doesn't imply agreement with
other aspects of the patch.  Please don't put words into my mouth.  I
haven't reviewed this patch in detail; I've only commented on specific
aspects of it as they have arisen in discussion.  I may or may not
someday review it in detail, but not before I'm fairly confident that
the known issues raised by other community members have been addressed
as thoroughly as possible.

> In reality, the
> main reason for that is: getting this feature to a point where it
> might plausibly be committed is bloody difficult, as evidenced by the
> fact that it took this long for someone to produce a patch. Please
> don't lose sight of that.

I'm not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 27 October 2014 15:55, Robert Haas <robertmhaas@gmail.com> wrote:

> Commenting on one aspect of a patch doesn't imply agreement with
> other aspects of the patch.  Please don't put words into my mouth.  I
> haven't reviewed this patch in detail; I've only commented on specific
> aspects of it as they have arisen in discussion.  I may or may not
> someday review it in detail, but not before I'm fairly confident that
> the known issues raised by other community members have been addressed
> as thoroughly as possible.

+1

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Oct 27, 2014 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 27 October 2014 15:55, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> Commenting on one aspect of a patch doesn't imply agreement with
>> other aspects of the patch.  Please don't put words into my mouth.  I
>> haven't reviewed this patch in detail; I've only commented on specific
>> aspects of it as they have arisen in discussion.  I may or may not
>> someday review it in detail, but not before I'm fairly confident that
>> the known issues raised by other community members have been addressed
>> as thoroughly as possible.
>
> +1

I wasn't putting words in anyone's mouth; I *don't* think that it's
true that Robert thinks patches 0001-* and 0002-* are perfectly fine,
and implied as much myself. I just think the *strongly* worded
disapproval of the user-visible interface of 0003-* was odd; it was
way out of proportion to its immediate importance to getting this
patch on track. AFAICT it was the *only* feedback that I didn't act on
with V1.3 (Robert's complaint about how inference happens during parse
analysis was a response to V1.3).

I'm not always going to be able to act on every item of feedback
immediately, or I'll have my own ideas about how to handle certain
things. I don't think that's all that big of a deal, since I've acted
on almost all feedback. I think by far the biggest problem here is the
lack of attention to the design from others.

I did a lot of copy-editing to the Wiki page yesterday. There are
actually few clear open items now:
https://wiki.postgresql.org/wiki/UPSERT#Open_Items

Some previous "open items" have been moved to here:
https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd_properties_of_proposed_ON_CONFLICT_patch

This is basically a section describing things that have not been
controversial or in need of adjusting, and may well never be, but I
wish we'd talk about because they're in some way novel or
counter-intuitive. This is the kind of things I'd like us to discuss
more.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 27 October 2014 17:44, Peter Geoghegan <pg@heroku.com> wrote:

> I did a lot of copy-editing to the Wiki page yesterday. There are
> actually few clear open items now:
> https://wiki.postgresql.org/wiki/UPSERT#Open_Items

I've read this page.

Please do these things, both of which have been requested multiple times...

1. Take the specific docs that relate to the patch and put them in one
place, so that everybody can read and understand and agree the
behaviour of the patch. So that someone reading that can see *exactly*
what is being proposed, not read through pages of other unchanged
material hoping to catch the few points that really differ.

2. Replace CONFLICTING() with what I have asked for in earlier posts.
The request is not repeated again, to avoid confusion.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Oct 27, 2014 at 11:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> 1. Take the specific docs that relate to the patch and put them in one
> place, so that everybody can read and understand and agree the
> behaviour of the patch. So that someone reading that can see *exactly*
> what is being proposed, not read through pages of other unchanged
> material hoping to catch the few points that really differ.

I'm afraid I don't follow. I have links to the user-visible
documentation (v1.3) on the Wiki:
https://wiki.postgresql.org/wiki/UPSERT#Documentation

The documentation is complete. I link to every interesting page from
the documentation directly from the Wiki, too. Of course, I also
describe the issues in more detail for those with an interest in the
implementation on the Wiki page itself (and list open issues). I have
isolation tests that illustrate the new facets of visibility for READ
COMMITTED, too.

How, specifically, have I failed to do what you ask here? If you want
to see exactly what has changed, in a differential fashion, well,
that's what a diff is for. I'm not aware of any existing way of
rendering to html for readability, while highlighting what is new.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I think by far the biggest problem here is the
> lack of attention to the design from others.

I find that attitude incredible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Oct 27, 2014 at 1:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I think by far the biggest problem here is the
>> lack of attention to the design from others.
>
> I find that attitude incredible.

What I mean is that that's the thing that clearly needs scrutiny the
most. That isn't an attitude - it's a technical opinion.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Simon Riggs
Date:
On 27 October 2014 20:24, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Oct 27, 2014 at 1:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Oct 27, 2014 at 1:44 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> I think by far the biggest problem here is the
>>> lack of attention to the design from others.
>>
>> I find that attitude incredible.
>
> What I mean is that that's the thing that clearly needs scrutiny the
> most. That isn't an attitude - it's a technical opinion.

Let's see if we can link these two thoughts.

1. You think the biggest problem is the lack of attention to the design.

2. I keep asking you to put the docs in a readable form.

If you can't understand the link between those two things, I am at a loss.

Good luck with the patch.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Oct 27, 2014 at 4:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Let's see if we can link these two thoughts.
>
> 1. You think the biggest problem is the lack of attention to the design.
>
> 2. I keep asking you to put the docs in a readable form.
>
> If you can't understand the link between those two things, I am at a loss.

You've read the docs. Please be clearer. In what sense are they not
readable? The main description of the feature appears on the INSERT
reference page:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

These paragraphs were added (there is more on the INSERT reference
page than mentioned here, but this is the main part):

"""
The optional ON CONFLICT clause specifies a path to take as an
alternative to raising a uniqueness violation error. ON CONFLICT
IGNORE simply avoids inserting any individual row when it is
determined that a uniqueness violation error would otherwise need to
be raised. ON CONFLICT UPDATE has the system take an UPDATE path in
respect of such rows instead. ON CONFLICT UPDATE guarantees an atomic
INSERT or UPDATE outcome. While rows may be updated, the top-level
statement is still an INSERT, which is significant for the purposes of
statement-level triggers and the rules system. Note that in the event
of an ON CONFLICT path being taken, RETURNING returns no value in
respect of any not-inserted rows.

ON CONFLICT UPDATE optionally accepts a WHERE clause condition. When
provided, the statement only proceeds with updating if the condition
is satisfied. Otherwise, unlike a conventional UPDATE, the row is
still locked for update. Note that the condition is evaluated last,
after a conflict has been identified as a candidate to update.

ON CONFLICT UPDATE accepts EXCLUDED expressions in both its targetlist
and WHERE clause. This allows expressions (in particular, assignments)
to reference rows originally proposed for insertion. Note that the
effects of all per-row BEFORE INSERT triggers are carried forward.
This is particularly useful for multi-insert ON CONFLICT UPDATE
statements; when inserting or updating multiple rows, constants need
only appear once.

There are several restrictions on the ON CONFLICT UPDATE clause that
do not apply to UPDATE statements. Subqueries may not appear in either
the UPDATE targetlist, nor its WHERE clause (although simple
multi-assignment expressions are supported). WHERE CURRENT OF cannot
be used. In general, only columns in the target table, and excluded
values originally proposed for insertion may be referenced. Operators
and functions may be used freely, though.

INSERT with an ON CONFLICT UPDATE clause is a deterministic statement.
You cannot UPDATE any single row more than once. Writing such an
INSERT statement will raise a cardinality violation error. Rows
proposed for insertion should not duplicate each other in terms of
attributes constrained by the conflict-arbitrating unique index. Note
that the ordinary rules for unique indexes with regard to NULL apply
analogously to whether or not an arbitrating unique index indicates if
the alternative path should be taken, so multiple NULL values across a
set of rows proposed for insertion are acceptable; the statement will
always insert (assuming there is no unrelated error). Note that merely
locking a row (by having it not satisfy the WHERE clause condition)
does not count towards whether or not the row has been affected
multiple times (and whether or not a cardinality violation error is
raised).

ON CONFLICT UPDATE requires a column specification, which is used to
infer an index to limit pre-checking for duplicates to. ON CONFLICT
IGNORE makes this optional. Failure to anticipate and prevent would-be
uniqueness violations originating in some other unique index than the
single unique index that was anticipated as the sole source of
would-be uniqueness violations can result in failing to insert a row
(taking the IGNORE path) when a uniqueness violation may have actually
been appropriate; omitting the specification indicates a total
indifference to where any would-be uniqueness violation could occur.
Note that the ON CONFLICT UPDATE assignment may result in a uniqueness
violation, just as with a conventional UPDATE.

The rules for unique index inference are straightforward. Columns
and/or expressions specified must match all the columns/expressions of
some existing unique index on table_name. The order of the
columns/expressions in the index definition, or whether or not the
index definition specified NULLS FIRST or NULLS LAST, or the internal
sort order of each column (whether DESC or ASC were specified) are all
irrelevant. However, partial unique indexes are not supported as
arbiters of whether an alternative ON CONFLICT path should be taken.
Deferred unique constraints are also not accepted.

You must have INSERT privilege on a table in order to insert into it,
as well as UPDATE privilege if and only if ON CONFLICT UPDATE is
specified. If a column list is specified, you only need INSERT
privilege on the listed columns. Similarly, when ON CONFLICT UPDATE is
specified, you only need UPDATE privilege on the column(s) that are
listed to be updated, as well as SELECT privilege on any column whose
values are read in the ON CONFLICT UPDATE expressions or condition.
Use of the RETURNING clause requires SELECT privilege on all columns
mentioned in RETURNING. If you use the query clause to insert rows
from a query, you of course need to have SELECT privilege on any table
or column used in the query.
"""

On the same page, there is commentary on each new addition to the
grammar (e.g. expression_index, column_name_index, expression). There
are some worked examples, too (there are now 3 ON CONFLICT related
examples out of 12 total):

"""
Insert or update new distributors as appropriate. Assumes a unique
index has been defined that constrains values appearing in the did
column. Note that an EXCLUDED expression is used to reference values
originally proposed for insertion:
 INSERT INTO distributors (did, dname) VALUES (5, 'Gizmo transglobal'), (6, 'Doohickey, inc') ON CONFLICT (did) UPDATE
SETdname = EXCLUDED(dname) || ' (formerly ' || dname || ')'
 

Insert a distributor, or do nothing for rows proposed for insertion
when an existing, excluded row (a row with a matching constrained
column or columns after before row insert triggers fire) exists.
Assumes a unique index has been defined that constrains values
appearing in the did column (although since the IGNORE variant was
used, the specification of columns to infer a unique index from is not
mandatory):
 INSERT INTO distributors (did, dname) VALUES (7, 'Doodad GmbH') ON CONFLICT (did) IGNORE

Insert or update new distributors as appropriate. Assumes a unique
index has been defined that constrains values appearing in the did
column. WHERE clause is used to limit the rows actually updated (any
existing row not updated will still be locked, though):
 -- Don't update any existing row if it was already renamed at some -- earlier stage INSERT INTO distributors (did,
dname)VALUES (8, 'Thingamabob Distribution') ON CONFLICT (did) UPDATE SET dname = EXCLUDED(dname) || ' (formerly ' ||
dname|| ')' WHERE dname NOT LIKE '%(formerly %)'
 
"""

If you do not provide actionable feedback, then I cannot do anything.
That's frustrating for me. Saying that the docs are not in a "readable
form" is too vague for me to act on. If I'm not mistaken, they're in
the same form (the same medium - sgml doc patches) that they've been
for every patch ever submitted to PostgreSQL (in terms of being easy
to read against the backdrop of existing docs, which you mentioned
before). I have built the html docs and uploaded them myself, as an
additional convenience to reviewers.

What standard am I failing to meet here? I spent *hours* editing the
Wiki yesterday to make the structure clearer (little new content was
added), and make the information exactly current. I am bending over
backwards to make the user-visible behaviors clearer, and to document
the patch well, both from a user-visible perspective, and an internal
perspective. I have internal docs for both implementations of value
locking (#1 and #2). Then there's the value locking Wiki page. I have
produced *reams* of documentation at this point, all in an effort to
make things easier to understand. The user visible doc patch diff
stats:

---doc/src/sgml/ddl.sgml                 |  23 +++doc/src/sgml/indices.sgml             |  11 +-doc/src/sgml/mvcc.sgml
             |  43 ++++--doc/src/sgml/plpgsql.sgml             |  20 ++-doc/src/sgml/postgres-fdw.sgml        |   8
++doc/src/sgml/ref/create_index.sgml   |   7 +-doc/src/sgml/ref/create_rule.sgml     |   6
+-doc/src/sgml/ref/create_table.sgml   |   5 +-doc/src/sgml/ref/create_trigger.sgml  |   5
+-doc/src/sgml/ref/create_view.sgml    |  36 ++++-doc/src/sgml/ref/insert.sgml          | 258
++++++++++++++++++++++++++++++++--doc/src/sgml/ref/set_constraints.sgml|   6 +-doc/src/sgml/trigger.sgml             |
46++++--13 files changed, 426 insertions(+), 48 deletions(-)
 

#1 internal documentation stats:
doc/src/sgml/indexam.sgml        | 133 ++++++++++++++++++++++++++++++++++++---src/backend/access/nbtree/README |  90
+++++++++++++++++++++++++-src/backend/executor/README     |  35 +++++++++++3 files changed, 247 insertions(+), 11
deletions(-)

#2 internal documentation stats:

---src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++1 file changed, 49 insertions(+)

Maybe those INSERT reference paragraphs could use another pass of
copy-editing, but that's not what you said - you suggested some far
more significant issue. Some other issues, like exactly how triggers
work with the feature are discussed on dedicated pages rather than the
main INSERT reference page, which seems consistent with the existing
documentation.

Have I not provided a total of 4 isolation tests illustrating
interesting concurrency/visibility interactions? That's a lot of
isolation tests. Here is the tests commit stat:
31 files changed, 1159 insertions(+), 8 deletions(-)

I really don't have any idea what you'd like me to do. Once again:
please be more specific. I don't know what you mean.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
> ---
>  doc/src/sgml/ddl.sgml                 |  23 +++
>  doc/src/sgml/indices.sgml             |  11 +-
>  doc/src/sgml/mvcc.sgml                |  43 ++++--
>  doc/src/sgml/plpgsql.sgml             |  20 ++-
>  doc/src/sgml/postgres-fdw.sgml        |   8 ++
>  doc/src/sgml/ref/create_index.sgml    |   7 +-
>  doc/src/sgml/ref/create_rule.sgml     |   6 +-
>  doc/src/sgml/ref/create_table.sgml    |   5 +-
>  doc/src/sgml/ref/create_trigger.sgml  |   5 +-
>  doc/src/sgml/ref/create_view.sgml     |  36 ++++-
>  doc/src/sgml/ref/insert.sgml          | 258 ++++++++++++++++++++++++++++++++--
>  doc/src/sgml/ref/set_constraints.sgml |   6 +-
>  doc/src/sgml/trigger.sgml             |  46 ++++--
>  13 files changed, 426 insertions(+), 48 deletions(-)
>
> #1 internal documentation stats:
>
>  doc/src/sgml/indexam.sgml        | 133 ++++++++++++++++++++++++++++++++++++---
>  src/backend/access/nbtree/README |  90 +++++++++++++++++++++++++-
>  src/backend/executor/README      |  35 +++++++++++
>  3 files changed, 247 insertions(+), 11 deletions(-)
>
> #2 internal documentation stats:
>
> ---
>  src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)

Just to put that in context, here are the documentation changes from
the original LATERAL commit:
doc/src/sgml/keywords.sgml                |   2 +-doc/src/sgml/queries.sgml                 |  83
+++++++++++++++++++++++++++++++++++++++++-doc/src/sgml/ref/select.sgml              | 102
+++++++++++++++++++++++++++++++++++++++++++++-------

Commit 0ef0b30 added data-modifying CTE docs (docs only). That looks like:
doc/src/sgml/queries.sgml    | 177
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++---doc/src/sgml/ref/select.sgml |  49 ++++++++++++++---2 files
changed,214 insertions(+), 12 deletions(-)
 

> Have I not provided a total of 4 isolation tests illustrating
> interesting concurrency/visibility interactions? That's a lot of
> isolation tests. Here is the tests commit stat:
>
>  31 files changed, 1159 insertions(+), 8 deletions(-)

And to put the tests in context, here are the stats from the original
Hot Standby commit:
src/test/regress/expected/hs_standby_allowed.out    |  215
++++++++++++++++++++++++++src/test/regress/expected/hs_standby_check.out      |   20
+++src/test/regress/expected/hs_standby_disallowed.out|  137
+++++++++++++++++src/test/regress/expected/hs_standby_functions.out |   40 +++++src/test/regress/pg_regress.c
           |   33 ++--src/test/regress/sql/hs_primary_extremes.sql        |   74
+++++++++src/test/regress/sql/hs_primary_setup.sql          |   25 +++src/test/regress/sql/hs_standby_allowed.sql
 |  121 +++++++++++++++src/test/regress/sql/hs_standby_check.sql           |   16
++src/test/regress/sql/hs_standby_disallowed.sql     |  105 +++++++++++++src/test/regress/sql/hs_standby_functions.sql
    |   24 +++src/test/regress/standby_schedule                   |   21 +++
 

So (at least as measured by raw lines of tests), this feature is
better tested than the original Hot Standby commit, and by a wide
margin. Tests also serve as an explanatory tool for the feature (in
particular, isolation tests can be used in this way).
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Oct 27, 2014 at 5:15 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Let's see if we can link these two thoughts.
>>
>> 1. You think the biggest problem is the lack of attention to the design.
>>
>> 2. I keep asking you to put the docs in a readable form.
>>
>> If you can't understand the link between those two things, I am at a loss.
>
> You've read the docs. Please be clearer. In what sense are they not
> readable? The main description of the feature appears on the INSERT
> reference page:
>
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

I've updated that reference page. I did a fair amount of copy-editing,
but also updated the docs to describe the latest (unpublished)
refinements to the syntax. Which is, as you and Robert requested, that
the target and rejected-for-insertion tuples may be referenced with
magical aliases in the style of OLD.* and NEW.*. I've spelt these
aliases as TARGET.* and EXCLUDED.*, since OLD.* and NEW.* didn't seem
to make much sense here. This requires some special processing during
rewriting (which, as you probably know, is true of DML statements in
general), and is certainly more invasive than what I had before, but
all told isn't too bad. Basically, there is still an ExcludedExpr, but
it only appears in the post-rewrite query tree, and is never created
by the raw grammar or processed during parse analysis.

I attach the doc patch with the relevant changes, in case you'd like a
quick reference to where things are changed.

I have already implemented the two things that you and Robert asked
for most recently: A costing model for unique index inference, and the
above syntax. I've also added IGNORE support to postgres_fdw (so you
can IGNORE if and only if a unique index inference specification is
omitted, just as with updatable views since V1.3).

Currently, I'm working on fixing an issue with RLS that I describe in
detail here:

https://wiki.postgresql.org/wiki/UPSERT#RLS

Once I fix that (provided it doesn't take too long), I'll publish a
V1.4. AFAICT, that'll close out all of the current open issues.

I hope this goes some way towards addressing your concerns.
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Nov 5, 2014 at 1:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Once I fix that (provided it doesn't take too long), I'll publish a
> V1.4. AFAICT, that'll close out all of the current open issues.

Attached is V1.4. As with V1.3, I continue to maintain both approaches
to value locking in parallel, believing this to be the most useful
direction for development to take for the time being. The consensus is
for approach #2 to value locking [1], but I see no reason to deny
reviewers the chance to compare both approaches. It's easy to maintain
the two, as the value locking implementation is well encapsulated -
The executor level stuff that has been altered in the last few
revisions tends to cause very few or no conflicts when rebasing.

Highlights
=======

* Costing of indexes for the purposes of determining which to have
arbitrate whether or not the executor takes the alternative path. So,
a list of expressions is created during parse analysis, and that list
is matched against existing indexes during optimization. It's usually
possible to avoid the work of generating paths, because (it seems
reasonable to suppose) there is usually 0 or 1 possible indexes in
representative cases. If it's 0, we get an error, originating from
where we now do this work -- the optimizer.

* EXCLUDED.* (and TARGET.*) pseudo-aliases (compare OLD.* and NEW.* in
the context of user-defined rules and conditional triggers) are
visible within auxiliary UPDATE (but not parent INSERT). See the
commit message for details on how that works. In short, we still have
a dedicated primnode expression, ExcludedExpr, but it is not ever
generated by the raw grammar (it can only be added by the during the
rewriting stage of query processing). It's just a facade, but a
perfectly convincing one. Note that this means that Vars can be
referenced from "another RTE" in what is actually a relation scan node
of the target:

postgres=# explain INSERT INTO upsert values(1, 'foo') on conflict
(key) update set val = excluded.val where excluded.val != 'bar';
                               QUERY PLAN
------------------------------------------------------------------------
 Insert on upsert  (cost=0.00..0.01 rows=1 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Conflict Update on upsert  (cost=0.00..32.99 rows=1591 width=36)
         Filter: ((excluded.val) <> 'bar'::text)
(4 rows)

Here, you're seeing a "Conflict Update" scan (actually, a quasi-hidden
sequential scan) on the upsert table that references a Var from the
facade excluded.* table/RTE. In fact, the Var is on the target table,
but read through our internal expression primnode (ExcludedExpr) so as
to get access to the excluded-from-insertion tuple slot during EPQ
expression evaluation for the UPDATE.

* postgres_fdw support for the IGNORE variant (provided there was no
unique index inference specification - just as with updatable views).

* Documentation clean-up - as I mentioned, I tried to address Simon's
concerns here. Also, as you'd expect, the documentation has been fixed
up to reflect the new syntax. I'll need to take a pass at updating the
UPSERT Wiki page soon, too.

Next steps
========

AFAICT, this revision addresses all open items bar one - the RLS bug,
which I could not decide on a fix for. I refer to the RLS issue
described on the Wiki [2].  As I mentioned before, I'd really like to
get some reviewer time on the executor level aspects of this, which
are relatively new, and have received no scrutiny from anyone else
that I'm aware of. This list of items is a good place to start, for
those that are interested:

https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd_properties_of_proposed_ON_CONFLICT_patch

My use of the EvalPlanQual() mechanism, and the structure of the plan
tree in general could really use some scrutiny too.

Thanks

[1] https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29
[2] https://wiki.postgresql.org/wiki/UPSERT#RLS
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
> * Documentation clean-up - as I mentioned, I tried to address Simon's
> concerns here. Also, as you'd expect, the documentation has been fixed
> up to reflect the new syntax. I'll need to take a pass at updating the
> UPSERT Wiki page soon, too.

I should mention that I've updated the Wiki to reflect the current,
post-v1.4 state of affairs. That remains the best place to get a
relatively quick overview of where things stand with open items,
discussion of the patch, etc:

https://wiki.postgresql.org/wiki/UPSERT

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached is V1.4.

Someone mentioned to me privately that they weren't sure that the
question of whether or not RETURNING only projected actually inserted
tuples was the right one. Also, I think someone else mentioned this a
few months back. I'd like to address this question directly sooner
rather than later, and so I've added a note on the Wiki page in
relation to this [1]. It's a possible area of concern at this point.

Anyway, it wouldn't require much implementation effort to change the
behavior so that updated tuples were also projected. In addition, we
might also consider the necessity of inventing a mechanism to make
apparent whether the tuple was inserted or updated. The discussion
needs to happen first, though.

[1] https://wiki.postgresql.org/wiki/UPSERT#RETURNING_behavior
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Andreas Karlsson
Date:
On 11/20/2014 01:52 AM, Peter Geoghegan wrote:
> On Mon, Nov 10, 2014 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Also, I think someone else mentioned this a few months back.

Yeah, that was me.

I think we have three options.

1. Return only inserted tuples
2. Return inserted and updated tuples
3. Return inserted, updated and skipped tuples

To me option 1 is surprising and less useful since I imagine in most 
cases where you do an upsert you do not care if the tuple was inserted 
or updated as long as it has the right values after the upsert, and 
these values is also what I would expect to be returned.

The possible use case I see for option 3 is when you want the values of 
automatically generated columns but there is actually no work to do if 
another transaction had already inserted the same row (same according to 
the unique constraints). But this behavior even though useful in certain 
cases might be surprising.

Andreas




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Nov 19, 2014 at 5:37 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> I think we have three options.
>
> 1. Return only inserted tuples
> 2. Return inserted and updated tuples
> 3. Return inserted, updated and skipped tuples
>
> To me option 1 is surprising and less useful since I imagine in most cases
> where you do an upsert you do not care if the tuple was inserted or updated
> as long as it has the right values after the upsert, and these values is
> also what I would expect to be returned.

I can see why you'd say that about option 1. That also seems like an
argument against surfacing the distinction directly (through a
dedicated hidden column or other expressing that RETURNING might
reference, say).

> The possible use case I see for option 3 is when you want the values of
> automatically generated columns but there is actually no work to do if
> another transaction had already inserted the same row (same according to the
> unique constraints). But this behavior even though useful in certain cases
> might be surprising.

I think that 3 is out. It seems hard to justify not RETURNING anything
in respect of a slot when there is a before row insert trigger that
returns NULL on the one hand, but RETURNING something despite not
inserting for ON CONFLICT UPDATE on the other.

I think if we do this, we're also going to have to set a command tag.
That could look like this:

postgres=# INSERT INTO upsert values(1, 'Foo'), (2, 'Bar') ON CONFLICT
(key) UPDATE SET val = EXCLUDED.val;
INSERT 0 1 UPDATE 1

Or perhaps like this:

postgres=# INSERT INTO upsert values(1, 'Foo'), (2, 'Bar') ON CONFLICT
(key) UPDATE SET val = EXCLUDED.val;
UPSERT 0 2

Maybe the latter is better, because it's less likely to break tools
that currently parse the command tag. But if we went with the former
command tag format, we'd have to figure out if there should always be
an "UPDATE part" of INSERT command tags generally, even when there was
no ON CONFLICT UPDATE clause. I guess in that case it would have to
become stable/consistent across INSERTs, so we'd always have an
"UPDATE part", but I'm not sure.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Nov 19, 2014 at 6:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I think that 3 is out. It seems hard to justify not RETURNING anything
> in respect of a slot when there is a before row insert trigger that
> returns NULL on the one hand, but RETURNING something despite not
> inserting for ON CONFLICT UPDATE on the other.

I mean: despite not inserting *or updating*.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Wed, 2014-11-19 at 16:52 -0800, Peter Geoghegan wrote:
> Someone mentioned to me privately that they weren't sure that the
> question of whether or not RETURNING only projected actually inserted
> tuples was the right one. Also, I think someone else mentioned this a
> few months back. I'd like to address this question directly sooner
> rather than later, and so I've added a note on the Wiki page in
> relation to this [1]. It's a possible area of concern at this point.

I think the biggest problem with the current approach is that there is
no way to know if a row was skipped by the where clause when using
INSERT ON CONFLICT UPDATE ... WHERE.

I am a developer of the Django ORM. Django reports to the user whether a
row was inserted or updated. It is possible to know which rows were
inserted by returning the primary key value. If something is returned,
then it was an insert. If Django implements updated vs inserted checking
this way, then if PostgreSQL adds RETURNING for update case later on,
that would be a breaking change for Django.

So, if it is not too hard to implement RETURNING for the update case
then I think it should be done. A pseudo column informing if the result
was an update or insert would then be a requirement for Django. Changing
the returning behavior in later releases might cause problems due to
backwards compatibility.
- Anssi





Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Nov 19, 2014 at 10:37 PM, Anssi Kääriäinen
<anssi.kaariainen@thl.fi> wrote:
> I think the biggest problem with the current approach is that there is
> no way to know if a row was skipped by the where clause when using
> INSERT ON CONFLICT UPDATE ... WHERE.

Well, there could have always been an UPDATE in a trigger or something
like that.

> I am a developer of the Django ORM. Django reports to the user whether a
> row was inserted or updated. It is possible to know which rows were
> inserted by returning the primary key value. If something is returned,
> then it was an insert. If Django implements updated vs inserted checking
> this way, then if PostgreSQL adds RETURNING for update case later on,
> that would be a breaking change for Django.

How does that actually work at the moment? Do you use RETURNING, or
look at the command tag? Would you be happy to just know that certain
rows were either inserted or updated in the context of an UPSERT (and
not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
NULL), or do you want to specifically know if there was an insert or
an update in respect of each row/slot processed?

--
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Nov 20, 2014 at 1:42 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Would you be happy to just know that certain
> rows were either inserted or updated in the context of an UPSERT (and
> not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
> NULL)

Of course, having the WHERE clause in the auxiliary UPDATE not pass
would also be cause to *not* return/project the not-processed row/slot
(in a world where we do something with RETURNING in respect of rows
actually processed by the auxiliary UPDATE). I mean, you're seeing the
final version of the row when RETURNING with an UPDATE, and if the
UPDATE is never evaluated, then the would-be final version (which is
generally based on the TARGET tuple and EXLCUDED tuple, as processed
by the UPDATE) never exists, and so clearly cannot be projected by
RETURNING.

This explanation a tiny bit misleading, because the rows/slots not
affected by the UPDATE (or INSERT) are still *locked*, even when the
UPDATE's WHERE clause does not pass - they have been processed to the
extent that they were locked. This is also true of postgres_fdw in
certain situations, but it seems like a very minor issue.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Thu, 2014-11-20 at 13:42 -0800, Peter Geoghegan wrote:
> > I am a developer of the Django ORM. Django reports to the user whether a
> > row was inserted or updated. It is possible to know which rows were
> > inserted by returning the primary key value. If something is returned,
> > then it was an insert. If Django implements updated vs inserted checking
> > this way, then if PostgreSQL adds RETURNING for update case later on,
> > that would be a breaking change for Django.
> 
> How does that actually work at the moment? Do you use RETURNING, or
> look at the command tag? Would you be happy to just know that certain
> rows were either inserted or updated in the context of an UPSERT (and
> not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
> NULL), or do you want to specifically know if there was an insert or
> an update in respect of each row/slot processed?

Django uses the command tag currently to check if a row was updated. We
also use RETURNING to get SERIAL values back from the database on
insert.

The most likely place to use this functionality in Django is
Model.save(). This method is best defined as "make sure this object's
state is either inserted or updated to the database by the primary key
of the object". The Model.save() method needs to also report if the
model was created or updated. The command tag is sufficient for this
case.

So, the proposed feature now has everything Django needs for
Model.save().

Django might add a bulk_merge(objs) command later on. This is defined as
"make sure each obj in objs is persisted to the database using the
fastest way available". The INSERT ON CONFLICT UPDATE command looks
excellent for this case. In this case it will be more problematic to
check which rows were inserted, which update, as we need information for
each primary key value separately for this case.

When I think of this feature outside of Django, it seems it is
completely reasonable to return database computed values on UPSERT. This
requires two queries with the proposed API. My opinion is that RETURNING
for the update case is better than not having it.
- Anssi




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Nov 20, 2014 at 10:58 PM, Anssi Kääriäinen
<anssi.kaariainen@thl.fi> wrote:
> Django uses the command tag currently to check if a row was updated. We
> also use RETURNING to get SERIAL values back from the database on
> insert.
>
> The most likely place to use this functionality in Django is
> Model.save(). This method is best defined as "make sure this object's
> state is either inserted or updated to the database by the primary key
> of the object". The Model.save() method needs to also report if the
> model was created or updated. The command tag is sufficient for this
> case.
>
> So, the proposed feature now has everything Django needs for
> Model.save().

So, to be clear, it would be okay if the command tag reported number
of rows *upserted*, without making any distinction between whether
they were actually inserted or updated? That seems like the least
invasive thing, since top level commands historically report a single
number of affected rows (FWIW I wouldn't report the first OID of a
single inserted tuple, as currently happens with the INSERT command
tag). The ON CONFLICT IGNORE variant would still report number of rows
inserted, though. RETURNING would show the resulting rows from either
the insert or update for each slot processed to completion (i.e.
actually inserted or updated) by the upsert, without making any
user-visible distinction (you asked for an upsert, so you must not
care).

> Django might add a bulk_merge(objs) command later on. This is defined as
> "make sure each obj in objs is persisted to the database using the
> fastest way available". The INSERT ON CONFLICT UPDATE command looks
> excellent for this case. In this case it will be more problematic to
> check which rows were inserted, which update, as we need information for
> each primary key value separately for this case.

Cool.

> When I think of this feature outside of Django, it seems it is
> completely reasonable to return database computed values on UPSERT. This
> requires two queries with the proposed API. My opinion is that RETURNING
> for the update case is better than not having it.

I am almost convinced that that behavior is better. I would like to
hear more opinions before looking at adding the necessary changes to
the implementation, though. It might be a bit questionable that ON
CONFLICT IGNORE and ON CONFLICT UPDATE have different command tags,
for example.

What do other people think? Should RETURNING project updated tuples as
well as inserted tuples, as described here?

--
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Nov 21, 2014 at 3:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
> What do other people think? Should RETURNING project updated tuples as
> well as inserted tuples, as described here?

I think it should.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Nov 24, 2014 at 6:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 21, 2014 at 3:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> What do other people think? Should RETURNING project updated tuples as
>> well as inserted tuples, as described here?
>
> I think it should.

Looks like the consensus is that we should have RETURNING project
updated tuples too, then.

I've already written the code to do this (and to report an "UPSERT"
command tag), which is very straightforward. The next revision will
have this behavior. However, I'm going to wait a little while longer
before formally publishing a new revision.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Looks like the consensus is that we should have RETURNING project
> updated tuples too, then.

Attached revision, v1.5, establishes this behavior (as always, there
is a variant for each approach to value locking). There is a new
commit with a commit message describing the new RETURNING/command tag
behavior in detail, so no need to repeat it here. The documentation
has been updated in these areas, too.

There is also one or two tiny comment tweaks here and there, as well
as a pg_proc OID collision fix in the case of the value locking
approach #1 variant.

My mirror of the documentation (i.e. a html build) has been updated.
INSERT command documentation (for new RETURNING behavior):

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

Details of changes to command tag:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/protocol-message-formats.html

I'll make a pass at the Wiki page to reflect these changes soon.
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote:
> On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > Looks like the consensus is that we should have RETURNING project
> > updated tuples too, then.
> 
> Attached revision, v1.5, establishes this behavior (as always, there
> is a variant for each approach to value locking). There is a new
> commit with a commit message describing the new RETURNING/command tag
> behavior in detail, so no need to repeat it here. The documentation
> has been updated in these areas, too.

It seems there isn't any way to distinguish between insert and update of
given row. Maybe a pseudo-column can be added so that it can be used in
the returning statement:
 insert into foobar(id, other_col) values(2, '2') on conflict (id) update set other_col=excluded.other_col returning
id,pseudo.was_updated;
 

This would ensure that users could check for each primary key value if
the row was updated or inserted.

Of course, the pseudo.was_updated name should be replaced by something
better.

It would be nice to be able to skip updates of rows that were not
changed:
  insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col where target is distinct
fromexcluded;
 
- Anssi Kääriäinen





Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Craig Ringer
Date:
On 12/04/2014 07:07 PM, Anssi Kääriäinen wrote:
> On Wed, 2014-11-26 at 16:59 -0800, Peter Geoghegan wrote:
>> On Mon, Nov 24, 2014 at 1:03 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> Looks like the consensus is that we should have RETURNING project
>>> updated tuples too, then.
>>
>> Attached revision, v1.5, establishes this behavior (as always, there
>> is a variant for each approach to value locking). There is a new
>> commit with a commit message describing the new RETURNING/command tag
>> behavior in detail, so no need to repeat it here. The documentation
>> has been updated in these areas, too.
> 
> It seems there isn't any way to distinguish between insert and update of
> given row. Maybe a pseudo-column can be added so that it can be used in
> the returning statement

Yes, I think that's pretty important. With a negative attno so it's
treated as a "hidden" col that must be explicitly named to be shown and
won't be confused with user columns.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 4, 2014 at 3:04 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> Yes, I think that's pretty important. With a negative attno so it's
> treated as a "hidden" col that must be explicitly named to be shown and
> won't be confused with user columns.

I think that the standard for adding a new system attribute ought to
be enormous. The only case where a new one was added post-Postgres95
was "tableoid". I'm pretty sure that others aren't going to want to do
it that way. Besides, I'm not entirely convinced that this is actually
an important distinction to expose.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Thu, 2014-12-04 at 10:27 -0800, Peter Geoghegan wrote:

> I think that the standard for adding a new system attribute ought to
> be enormous. The only case where a new one was added post-Postgres95
> was "tableoid". I'm pretty sure that others aren't going to want to do
> it that way. Besides, I'm not entirely convinced that this is actually
> an important distinction to expose.

For Django's use case this is a requirement. We must inform the user if
the save() action created a new row or if it modified an existing one.

Another way to do this would be to expose the "excluded" alias in the
returning clause. All columns of the excluded alias would be null in
the case of insert (especially the primary key column), and thus if a
query   insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col returning
excluded.id
returns a non-null value, then it was an update.
- Anssi





Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 4, 2014 at 10:27 PM, Anssi Kääriäinen
<anssi.kaariainen@thl.fi> wrote:
> For Django's use case this is a requirement. We must inform the user if
> the save() action created a new row or if it modified an existing one.

Can you explain that in more detail, please?

> Another way to do this would be to expose the "excluded" alias in the
> returning clause. All columns of the excluded alias would be null in
> the case of insert (especially the primary key column), and thus if a
> query
>     insert into foobar values(2, '2') on conflict (id) update set other_col=excluded.other_col returning excluded.id
> returns a non-null value, then it was an update.

I don't like that idea much, TBH. Consider this:

postgres=# update upsert u set key = 1 from upsert i returning key;
ERROR:  42702: column reference "key" is ambiguous
LINE 1: update upsert u set key = 1 from upsert i returning key;
  ^ 

So, suppose this example was actually an ON CONFLICT UPDATE query. If
I similarly make the aliases in the ON CONFLICT UPDATE
("target"/"excluded") visible in the returning list, it becomes
necessary to qualify every column - an ambiguity is introduced by
making both aliases visible, since any non-qualified column in the
RETURNING clause could be from either the "target" or "excluded"
alias/RTE. This is particularly annoying for the common, simple cases.
Also, when there was an update in respect of a any given slot, how, in
general, can I be sure that *any* visible excluded.* attribute is not
null (which you suggest as a reliable proxy for the update path having
been taken)? For one thing, the unique index that arbitrates whether
or not we take the "alternative path" is not restricting to covering
non-nullable attributes. So does the user end up specifying
system/hidden atrributes, just to make what you outline work? That
seems sort of messy.

--
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Fri, 2014-12-05 at 00:21 -0800, Peter Geoghegan wrote:
> On Thu, Dec 4, 2014 at 10:27 PM, Anssi Kääriäinen
> <anssi.kaariainen@thl.fi> wrote:
> > For Django's use case this is a requirement. We must inform the user if
> > the save() action created a new row or if it modified an existing one.
> 
> Can you explain that in more detail, please?

Django has a post_save signal. The signal provide information of the
save operation. One piece of information is a created boolean flag. When
set to True the operation was an insert, on False it was an update.
See
https://docs.djangoproject.com/en/1.7/ref/signals/#django.db.models.signals.post_save
for details.

The created flag is typically used to perform some related action. An
example is User and UserProfile models. Each user must have an
UserProfile, so post_save can be used to create an empty userprofile on
creation of user.

If Django is going to use the INSERT ... ON CONFLICT UPDATE variant in
Django for the existing save() method, then it needs to know if the
result was an UPDATE or INSERT. If we are going to use this for other
operations (for example bulk merge of rows to the database), it would be
very convenient to have per-row updated/created information available so
that we can fire the post_save signals for the rows. If we don't have
that information available, it means we can't fire signals, and no
signals means we can't use the bulk merge operation internally as we
have to fire the signals where that happened before.

Outside of Django there are likely similar reasons to want to know if
the result of an operation was a creation of a new row. The reason could
be creation of related row, doing some action in application layer, or
just UI message telling "object created successfully" vs "object updated
successfully".
- Anssi




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Dec 4, 2014 at 1:27 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Dec 4, 2014 at 3:04 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> Yes, I think that's pretty important. With a negative attno so it's
>> treated as a "hidden" col that must be explicitly named to be shown and
>> won't be confused with user columns.
>
> I think that the standard for adding a new system attribute ought to
> be enormous. The only case where a new one was added post-Postgres95
> was "tableoid". I'm pretty sure that others aren't going to want to do
> it that way.

+1.  System attributes are a mess; a negative attribute number implies
not only that the column is by default hidden from view but also that
it is stored in the tuple header rather than the tuple data.  Using
one here, where we're not even talking about a property of the tuple
per se, would be a pretty goofy solution.

> Besides, I'm not entirely convinced that this is actually
> an important distinction to expose.

I think it's probably an important distinction, for the kinds of
reasons Anssi mentions, but we should look for some method other than
a system column of indicating it.  Maybe there's a magic function that
returns a Boolean which you can call, or maybe some special clause, as
with WITH ORDINALITY.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 12/05/2014 07:59 AM, Robert Haas wrote:
> I think it's probably an important distinction, for the kinds of
> reasons Anssi mentions, but we should look for some method other than
> a system column of indicating it.  Maybe there's a magic function that
> returns a Boolean which you can call, or maybe some special clause, as
> with WITH ORDINALITY.

I thought the point of INSERT ... ON CONFLICT update was so that you
didn't have to care if it was a new row or not?

If you do care, it seems like it makes more sense to do your own INSERTs
and UPDATEs, as Django currently does.

I wouldn't be *opposed* to having a pseudocolumn in the RETURNed stuff
which let me know updated|inserted|ignored, but I also don't see it as a
feature requirement for 9.5.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Dec 5, 2014 at 10:00 AM, Josh Berkus <josh@agliodbs.com> wrote:
> I thought the point of INSERT ... ON CONFLICT update was so that you
> didn't have to care if it was a new row or not?
>
> If you do care, it seems like it makes more sense to do your own INSERTs
> and UPDATEs, as Django currently does.
>
> I wouldn't be *opposed* to having a pseudocolumn in the RETURNed stuff
> which let me know updated|inserted|ignored, but I also don't see it as a
> feature requirement for 9.5.

Agreed. Importantly, we won't have painted ourselves into a corner
where we cannot add it later, now that RETURNING projects updates
tuples, too (V1.5 established that). I'm pretty confident that it
would be a mistake to try and make the inner "excluded" and "target"
aliases visible, in any case, because of the annoying ambiguity it
creates for the common, simple cases. I think it has to be a magic
function, or something like that.

I really hoped we'd be having a more technical, implementation level
discussion at this point. Simon rightly emphasized the importance of
getting the semantics right, and the importance of discussing that up
front, but I'm concerned;  that's almost all that has been discussed
here so far, which is surely not balanced either.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Fri, Dec 5, 2014 at 1:07 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Agreed. Importantly, we won't have painted ourselves into a corner
> where we cannot add it later, now that RETURNING projects updates
> tuples, too (V1.5 established that).

Yeah, it seems fine to postpone that to a later version, as long as we
haven't painted ourselves into a corner.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Dec 5, 2014 at 1:01 AM, Anssi Kääriäinen
<anssi.kaariainen@thl.fi> wrote:
> If Django is going to use the INSERT ... ON CONFLICT UPDATE variant in
> Django for the existing save() method, then it needs to know if the
> result was an UPDATE or INSERT. If we are going to use this for other
> operations (for example bulk merge of rows to the database), it would be
> very convenient to have per-row updated/created information available so
> that we can fire the post_save signals for the rows. If we don't have
> that information available, it means we can't fire signals, and no
> signals means we can't use the bulk merge operation internally as we
> have to fire the signals where that happened before.
>
> Outside of Django there are likely similar reasons to want to know if
> the result of an operation was a creation of a new row. The reason could
> be creation of related row, doing some action in application layer, or
> just UI message telling "object created successfully" vs "object updated
> successfully".

It probably isn't ideal, but you'd at least be able to do something
with row level triggers in the absence of a standard way of directly
telling if an insert or update was performed.

--
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Anssi Kääriäinen
Date:
On Fri, 2014-12-05 at 10:00 -0800, Josh Berkus wrote:
> I thought the point of INSERT ... ON CONFLICT update was so that you
> didn't have to care if it was a new row or not?
> 
> If you do care, it seems like it makes more sense to do your own INSERTs
> and UPDATEs, as Django currently does.

Django tries to update the object if it already exists in the database.
If it doesn't, then Django does an insert. This is suboptimal from
concurrency standpoint, and does two round trips to the database instead
of just one.

For Django, both insert and update are OK when saving an object to the
database, but Django needs to know which one was done.

I too agree that this doesn't need to be handled in the first version of
the patch.
- Anssi




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
Attached revision, v1.6, slightly tweaks the ordering of per-statement
trigger execution. The ordering is now explicitly documented (the html
mirror has been updated:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html).

As always, there is a variant for each approach to value locking.

This revision fixes bitrot that developed when the patchset was
applied on master's tip, and also cleans up comments regarding how the
parent insert carries auxiliary/child state through all stages of
query processing. That should structure be clearer now, including how
setrefs.c has the auxiliary/child ModifyTable use the same
resultRelation as its parent.

--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached revision, v1.6, slightly tweaks the ordering of per-statement
> trigger execution.

Right now, there is no way for a before row insert/update trigger to
determine whether it was called as part of an INSERT ... ON CONFLICT
UPDATE or not. It's also not possible for a DO INSTEAD trigger on a
view (a before row insert trigger) to determine that it was called
specifically due to an INSERT...IGNORE (which I think ought to imply
that any corresponding, "redirected" insertion into a table should
also use IGNORE....that's at least going to be something that a
certain number of apps will need to be made robust against).

The question is: Do we want to expose this distinction to triggers?
The natural way to do so would probably be to add TG_SPECULATIVE
special variable to plpgsql (and equivalent variables in other PLs).
This text variable would be either "UPSERT" or "IGNORE"; it would be
NULL when it was not applicable (e.g. with traditional INSERTs).

How do people feel about this? Is it important to include this in our
initial cut of the feature? I thought that I'd avoid that kind of
thing until prompted to address it by others, since it probably won't
end up being a common concern, but I'd like to hear a few opinions.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Thu, Dec 11, 2014 at 1:11 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Attached revision, v1.6, slightly tweaks the ordering of per-statement
>> trigger execution.
>
> Right now, there is no way for a before row insert/update trigger to
> determine whether it was called as part of an INSERT ... ON CONFLICT
> UPDATE or not. It's also not possible for a DO INSTEAD trigger on a
> view (a before row insert trigger) to determine that it was called
> specifically due to an INSERT...IGNORE (which I think ought to imply
> that any corresponding, "redirected" insertion into a table should
> also use IGNORE....that's at least going to be something that a
> certain number of apps will need to be made robust against).
>
> The question is: Do we want to expose this distinction to triggers?
> The natural way to do so would probably be to add TG_SPECULATIVE
> special variable to plpgsql (and equivalent variables in other PLs).
> This text variable would be either "UPSERT" or "IGNORE"; it would be
> NULL when it was not applicable (e.g. with traditional INSERTs).
>
> How do people feel about this? Is it important to include this in our
> initial cut of the feature? I thought that I'd avoid that kind of
> thing until prompted to address it by others, since it probably won't
> end up being a common concern, but I'd like to hear a few opinions.

It's probably something we should add, but there's enough to do
getting the basic feature working first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Michael Paquier
Date:
On Fri, Dec 12, 2014 at 6:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> It's probably something we should add, but there's enough to do
> getting the basic feature working first.
Moving this patch to CF 2014-12 as work is still going on.
-- 
Michael



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:
On Mon, Dec 8, 2014 at 8:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
Attached revision, v1.6, slightly tweaks the ordering of per-statement
trigger execution. The ordering is now explicitly documented (the html
mirror has been updated:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/trigger-definition.html).

As always, there is a variant for each approach to value locking.

This revision fixes bitrot that developed when the patchset was
applied on master's tip, and also cleans up comments regarding how the
parent insert carries auxiliary/child state through all stages of
query processing. That should structure be clearer now, including how
setrefs.c has the auxiliary/child ModifyTable use the same
resultRelation as its parent.


If I build either option of the patch under MinGW, I get an error in the grammar files related to the IGNORE reserved word.

$ (./configure --host=x86_64-w64-mingw32 --without-zlib && make  && make check) > /dev/null

In file included from ../../../src/include/parser/gramparse.h:29:0,
                 from gram.y:59:
../../../src/include/parser/gram.h:207:6: error: expected identifier before numeric constant
In file included from gram.y:14366:0:

I don't get this problem on Linux.

The build chain seems to meet the specified minimum:

flex.exe 2.5.35
bison (GNU Bison) 2.4.2
This is perl, v5.8.8 built for msys-64int

It seems like IGNORE is getting replaced by the preprocessor with something else, but I don't know how to get my hands on the intermediate file after the preprocessor has done its thing.

Also, in both Linux and MinGW under option 1 patch I get an OID conflict on OID 3261.

Cheers,

Jeff
 

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> It seems like IGNORE is getting replaced by the preprocessor with something
> else, but I don't know how to get my hands on the intermediate file after
> the preprocessor has done its thing.

Maybe IGNORE is defined as a macro in MinGW?
Try s/IGNORE/IGNORE_P/g throughout the patch.
        regards, tom lane



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Also, in both Linux and MinGW under option 1 patch I get an OID conflict on
> OID 3261.

I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice
about macro collisions on MinGW while I'm at it, since his explanation
seems plausible.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 15, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> It seems like IGNORE is getting replaced by the preprocessor with something
>> else, but I don't know how to get my hands on the intermediate file after
>> the preprocessor has done its thing.
>
> Maybe IGNORE is defined as a macro in MinGW?
> Try s/IGNORE/IGNORE_P/g throughout the patch.

BTW, the gcc -E flag does this. So figure out what exact arguments
MinGW's gcc is passed in the ordinary course of compiling gram.c, and
prepend "-E" to the list of existing flags while manually executing
gcc -- that should let you know exactly what's happening here.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 15, 2014 at 4:59 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Also, in both Linux and MinGW under option 1 patch I get an OID conflict on
>> OID 3261.
>
> I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice
> about macro collisions on MinGW while I'm at it, since his explanation
> seems plausible.

Attached pair of revised patch sets fix the OID collision, and
presumably fix the MinGW issue (because IGNORE_P is now used as a
token name). It also polishes approach #2 to value locking in a few
places (e.g. better comments). Finally, both patches have a minor
buglet around EXPLAIN ANALYZE output fixed -- the output now indicates
if tuples are pulled up from auxiliary update nodes.

--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:

On Mon, Dec 15, 2014 at 5:05 PM, Peter Geoghegan <pg@heroku.com> wrote:
>
> Maybe IGNORE is defined as a macro in MinGW?
> Try s/IGNORE/IGNORE_P/g throughout the patch.

BTW, the gcc -E flag does this. So figure out what exact arguments
MinGW's gcc is passed in the ordinary course of compiling gram.c, and
prepend "-E" to the list of existing flags while manually executing
gcc -- that should let you know exactly what's happening here.

Yep, I tried that trick and had decided it didn't work in MinGW.  But I think it was a user error--I must have somehow broken up the build tree and 'make' didn't detect the problem.  Now I see that IGNORE is getting turned to 0.

Your new version 1.7 of the patches fixes that issue, as well as the OID conflict.

Thanks,

Jeff

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Your new version 1.7 of the patches fixes that issue, as well as the OID
> conflict.

Good.

You're probably aware that I maintain a stress testing suite for the
patch here: https://github.com/petergeoghegan/upsert

In the past, you've had a lot of success with coming up with stress
tests that find bugs. Maybe you can come up with some improvements to
the suite, if you'd care to test the patch. I can authorize your
Github account to push code to that repo, if you're interested.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Gavin Flower
Date:
On 17/12/14 10:11, Peter Geoghegan wrote:
> On Tue, Dec 16, 2014 at 11:08 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Your new version 1.7 of the patches fixes that issue, as well as the OID
>> conflict.
> Good.
>
> You're probably aware that I maintain a stress testing suite for the
> patch here: https://github.com/petergeoghegan/upsert
>
> In the past, you've had a lot of success with coming up with stress
> tests that find bugs. Maybe you can come up with some improvements to
> the suite, if you'd care to test the patch. I can authorize your
> Github account to push code to that repo, if you're interested.
Yeah!

I have just released a prototype software (not related to pg): I'm going 
to tell them to treat it with extreme suspicion, no matter how much they 
may respect the developer (me)!

Though like Pg, it is critical that it records data with reliability. 
Also, both need testing to try and detect intermittent errors (I already 
found one myself in the prototype - fortunately, not so critical it 
needs to be fixed in the prototype, but would have to be eliminated from 
the production version!).

So I think it really great to encourage people to come up with demanding 
tests, especially automated stress testing for pg.


Cheers,
Gavin

(Who wishes he had the time & experience to contribute to pg.)



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
It looks like we are close to reaching consensus on the syntax. Phew! 
Thanks for maintaining the wiki pages and the documentation. All of the 
below is based on those, I haven't looked at the patch itself yet.

The one thing that I still feel uneasy about is the Unique Index 
Inference thing. Per the syntax example from the wiki page, the UPSERT 
statement looks like this:

INSERT INTO upsert(key, val) VALUES(1, 'insert')   ON CONFLICT (key) IGNORE;

With ON CONFLICT IGNORE, the list of key columns can also be left out:

INSERT INTO upsert(key, val) VALUES(1, 'insert')   ON CONFLICT IGNORE;

The documentation says that:

> Omitting the specification indicates a total indifference to where
> any would-be uniqueness violation could occur, which isn't always
> appropriate; at times, it may be desirable for ON CONFLICT IGNORE to
> not suppress a duplicate violation within an index where that isn't
> explicitly anticipated. Note that ON CONFLICT UPDATE assignment may
> result in a uniqueness violation, just as with a conventional
> UPDATE.

Some questions:

1. Does that mean that if you leave out the key columns, the insertion 
is IGNOREd if it violates *any* unique key constraint?

2. If you do specify the key columns, then the IGNORE path is taken only 
if the insertion violates a unique key constraint on those particular 
columns. Otherwise an error is thrown. Right? Now, let's imagine a table 
like this:

CREATE TABLE persons (  username text unique,  real_name text unique,  data text
);

Is there any way to specify both of those constraints, so that the 
insertion is IGNOREd if it violates either one of them? If you try to do:

INSERT INTO persons(username, real_name, data)
VALUES('foobar', 'foo bar')
ON CONFLICT (username, real_name) IGNORE;

It will fail because there is no unique index on (username, real_name). 
In this particular case, you could leave out the specification, but if 
there was a third constraint that you're not expecting to conflict with, 
you would want violations of that constraint to still throw an error. 
And you can't leave out the specification with ON CONFLICT UPDATE anyway.

3. Why is the specification required with ON CONFLICT UPDATE, but not 
with ON CONFLICT IGNORE?

4. What happens if there are multiple unique indexes with identical 
columns, and you give those columns in the inference specification? 
Doesn't matter which index you use, I guess, if they're all identical, 
but see next question.

5. What if there are multiple unique indexes with the same columns, but 
different operator classes?

6. Why are partial unique indexes not supported as arbitrators?

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Josh Berkus
Date:
On 12/17/2014 01:12 PM, Heikki Linnakangas wrote:
> 3. Why is the specification required with ON CONFLICT UPDATE, but not
> with ON CONFLICT IGNORE?

Well, UPDATE has to know which row to lock, no?  IGNORE does not.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> It looks like we are close to reaching consensus on the syntax. Phew! Thanks
> for maintaining the wiki pages and the documentation. All of the below is
> based on those, I haven't looked at the patch itself yet.

Great, thanks!

Yes, I am relieved that we appeared to have agreed on a syntax.

> The one thing that I still feel uneasy about is the Unique Index Inference
> thing.

> The documentation says that:
>
>> Omitting the specification indicates a total indifference to where
>> any would-be uniqueness violation could occur, which isn't always
>> appropriate;

> Some questions:
>
> 1. Does that mean that if you leave out the key columns, the insertion is
> IGNOREd if it violates *any* unique key constraint?

Yes. This is particularly important for the implementation of things
like IGNORE's updatable view support. More generally, for various ETL
use cases it's possible to imagine the user simply not caring.

> 2. If you do specify the key columns, then the IGNORE path is taken only if
> the insertion violates a unique key constraint on those particular columns.
> Otherwise an error is thrown. Right?

That's right.

> Now, let's imagine a table like this:
>
> CREATE TABLE persons (
>   username text unique,
>   real_name text unique,
>   data text
> );
>
> Is there any way to specify both of those constraints, so that the insertion
> is IGNOREd if it violates either one of them? If you try to do:
>
> INSERT INTO persons(username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username, real_name) IGNORE;
>
> It will fail because there is no unique index on (username, real_name). In
> this particular case, you could leave out the specification, but if there
> was a third constraint that you're not expecting to conflict with, you would
> want violations of that constraint to still throw an error. And you can't
> leave out the specification with ON CONFLICT UPDATE anyway.

Good point.

For the IGNORE case: I guess the syntax just isn't that flexible. I
agree that that isn't ideal.

For the UPDATE case: Suppose your example was an UPDATE where we
simply assigned the excluded.data value to the data column in the
auxiliary UPDATE's targetlist. What would the user really be asking
for with that command, at a really high level? It seems like they
might actually want to run two UPSERT commands (one for username, the
other for real_name), or rethink their indexing strategy - in
particular, whether it's appropriate that there isn't a composite
unique constraint on (username, real_name).

Now, suppose that by accident or by convention it will always be
possible for a composite unique index to be built on (username,
real_name) - no dup violations would be raised if it was attempted,
but it just hasn't been and won't be. In other words, it's generally
safe to actually pretend that there is one. Then, surely it doesn't
matter if the user picks one or the other unique index. It'll all work
out when the user assigns to both in the UPDATE targetlist, because of
the assumed convention that I think is implied by the example. If the
convention is violated, at least you get a dup violation letting you
know (iff you bothered to assign). But I wouldn't like to encourage
that pattern.

I think that the long and the short of it is that you really ought to
have one unique index as an arbiter in mind when writing a DML
statement for the UPDATE variant. Relying on this type of convention
is possible, I suppose, but ill-advised.

> 3. Why is the specification required with ON CONFLICT UPDATE, but not with
> ON CONFLICT IGNORE?

That was a fairly recent decision, taken mainly to keep Kevin happy --
although TBH I don't recall that he was particularly insistent on that
restriction. I could still go either way on that question.

The idea, as I mentioned, is that it's legitimate to not care where a
dup violation might occur for certain ETL use cases. For UPSERT,
though, the only argument for not making it mandatory is that that's
something extra to type, and lazy people would prefer not to bother.
This is because we assume that the first dup violation is the only
possible one without the unique index inference clause. If we don't
have the index expressions with which to infer an arbiter unique index
(as with MySQL's ON DUPLICATE KEY UPDATE), you'd better be sure that
you accounted for all possible sources of would-be duplicate
violations - otherwise a random row will be updated!

That isn't a fantastic argument for not making a unique index
inference clause mandatory, but it might be an okay one.

> 4. What happens if there are multiple unique indexes with identical columns,
> and you give those columns in the inference specification? Doesn't matter
> which index you use, I guess, if they're all identical, but see next
> question.

It doesn't really matter which one you pick, but right now, at the
urging of Robert, we cost the list of candidates and pick the cheapest
iff there is more than one [1] (and error if there are none). This is
roughly similar to costing of indexes for CLUSTER, and occurs during
optimization (only parse analysis of the expressions associated with
the unique index inference clause occurs during parse analysis -
indexes are looked up and matched in the optimizer).

> 5. What if there are multiple unique indexes with the same columns, but
> different operator classes?

I thought about that. I am reusing a little bit of the CREATE INDEX
infrastructure for raw parsing, and for a small amount of parse
analysis (conveniently, this makes the command reject things like
aggregate functions with no additional code - the error messages only
mention "index expressions", so I believe that's fine). This could
include an opclass specification, but right now non-default opclasses
are rejected during extra steps in parse analysis, for no particular
reason.

I could easily have the unique index inference specification accept a
named opclass, if you thought that was important, and you thought
naming a non-default opclass by name was a good SQL interface. It
would take only a little effort to support non-default opclasses.

> 6. Why are partial unique indexes not supported as arbitrators?

Robert and I discussed this quite a bit -- it was the argument for
being able to name a unique index by name (not that I'm very happy
with that idea or anything) [2]. Basically, dealing with the possible
behaviors with before row insert triggers might in general greatly
complicate the implementation, even though the issues we'd then be
protected against would seldom arise. Robert seemed to think that we
could revisit this in a future version [3].

Note that IGNORE will still IGNORE any partial unique index -- it just
won't accept one as the sole arbiter of whether or not the IGNORE path
should be taken (so it's really the inference specification syntax
that doesn't accept partial unique indexes, just as it doesn't accept
updatable views, exclusion constraints, and inheritance parents where
the semantics are similarly iffy -- that's both the reason for and the
mechanism by which ON CONFLICT UPDATE does not support these things).

[1] http://www.postgresql.org/message-id/CAM3SWZQz+jYkwfuZvcSf0qtpa2QiY+8NGNcHjfWgz3DDzRfzEg@mail.gmail.com
[2] http://www.postgresql.org/message-id/CAM3SWZQ8tDPdjiwj_FW4AO8gEvpyiixwBE67OVQuufPJ+y1e1g@mail.gmail.com
[3] http://www.postgresql.org/message-id/CA+TgmoZgLgY2PBAMTY3T1jpYXAvNL-w=T6o+6pMqrVR+Vn-iyg@mail.gmail.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 12/18/2014 01:02 AM, Peter Geoghegan wrote:
> On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> Now, let's imagine a table like this:
>>
>> CREATE TABLE persons (
>>    username text unique,
>>    real_name text unique,
>>    data text
>> );
>>
>> Is there any way to specify both of those constraints, so that the insertion
>> is IGNOREd if it violates either one of them? If you try to do:
>>
>> INSERT INTO persons(username, real_name, data)
>> VALUES('foobar', 'foo bar')
>> ON CONFLICT (username, real_name) IGNORE;
>>
>> It will fail because there is no unique index on (username, real_name). In
>> this particular case, you could leave out the specification, but if there
>> was a third constraint that you're not expecting to conflict with, you would
>> want violations of that constraint to still throw an error. And you can't
>> leave out the specification with ON CONFLICT UPDATE anyway.
>
> Good point.
>
> For the IGNORE case: I guess the syntax just isn't that flexible. I
> agree that that isn't ideal.

It should be simple to allow multiple key specifications:

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON CONFLICT (username), (real_name) IGNORE;

It's a rather niche use case, but might as well support it for the sake 
of completeness.

> For the UPDATE case: Suppose your example was an UPDATE where we
> simply assigned the excluded.data value to the data column in the
> auxiliary UPDATE's targetlist. What would the user really be asking
> for with that command, at a really high level? It seems like they
> might actually want to run two UPSERT commands (one for username, the
> other for real_name), or rethink their indexing strategy - in
> particular, whether it's appropriate that there isn't a composite
> unique constraint on (username, real_name).
>
> Now, suppose that by accident or by convention it will always be
> possible for a composite unique index to be built on (username,
> real_name) - no dup violations would be raised if it was attempted,
> but it just hasn't been and won't be. In other words, it's generally
> safe to actually pretend that there is one. Then, surely it doesn't
> matter if the user picks one or the other unique index. It'll all work
> out when the user assigns to both in the UPDATE targetlist, because of
> the assumed convention that I think is implied by the example. If the
> convention is violated, at least you get a dup violation letting you
> know (iff you bothered to assign). But I wouldn't like to encourage
> that pattern.
>
> I think that the long and the short of it is that you really ought to
> have one unique index as an arbiter in mind when writing a DML
> statement for the UPDATE variant. Relying on this type of convention
> is possible, I suppose, but ill-advised.

Another thought is that you might want to specify a different action 
depending on which constraint is violated:

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON CONFLICT (username) IGNORE
ON CONFLICT (real_name) UPDATE ...;

Although that leaves the question of what to do if both are violated. 
Perhaps:

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON CONFLICT (username, real_name) IGNORE
ON CONFLICT (real_name) UPDATE username = excluded.username;
ON CONFLICT (username) UPDATE real_name = excluded.real_name;

>> 5. What if there are multiple unique indexes with the same columns, but
>> different operator classes?
>
> I thought about that. I am reusing a little bit of the CREATE INDEX
> infrastructure for raw parsing, and for a small amount of parse
> analysis (conveniently, this makes the command reject things like
> aggregate functions with no additional code - the error messages only
> mention "index expressions", so I believe that's fine). This could
> include an opclass specification, but right now non-default opclasses
> are rejected during extra steps in parse analysis, for no particular
> reason.
>
> I could easily have the unique index inference specification accept a
> named opclass, if you thought that was important, and you thought
> naming a non-default opclass by name was a good SQL interface. It
> would take only a little effort to support non-default opclasses.

It's a little weird to mention an opclass by name. It's similar to 
naming an index by name, really. How about naming the operator? For an 
exclusion constraint, that would be natural, as the syntax to create an 
exclusion constraint in the first place is "EXCLUDE USING gist (c WITH &&)"

Naming the index by columns makes sense in most cases, and I don't like 
specifying the index's name, but how about allowing naming a constraint? 
Indexes are just an implementation detail, but constraints are not. 
Unique and exclusion constraints are always backed by an index, so there 
is little difference in practice, but I would feel much more comfortable 
mentioning constraints by name than indexes.

Most people would list the columns, but if there is a really bizarre 
constraint, with non-default opclasses, or an exclusion constraint, it's 
probably been given a name that you could use.


In theory, with the promise tuple approach to locking, you don't 
necessarily even need an index to back up the constraint. You could just 
do a sequential scan of the whole table to see if there are any 
conflicting rows, then insert the row, and perform another scan to see 
if any conflicting rows appeared in the meantime. Performance would 
suck, and there is no guarantee that another backend doesn't do a 
regular INSERT into to the table that violates the imaginary constraint, 
so this is pretty useless in practice. So probably better to not allow it.

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username), (real_name) IGNORE;

> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username) IGNORE
> ON CONFLICT (real_name) UPDATE ...;

> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username, real_name) IGNORE
> ON CONFLICT (real_name) UPDATE username = excluded.username;
> ON CONFLICT (username) UPDATE real_name = excluded.real_name;

I like all of these suggestions, except that I think they reflect a
couple things about the syntax which was never settled[1]. First,
Robert suggested using DUPLICATE instead of CONFLICT, which I think
it clearer. So the above would become:

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON DUPLICATE (username), (real_name) IGNORE;

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON DUPLICATE (username) IGNORE
ON DUPLICATE (real_name) UPDATE ...;

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar')
ON DUPLICATE (username, real_name) IGNORE
ON DUPLICATE (real_name) UPDATE username = excluded.username;
ON DUPLICATE (username) UPDATE real_name = excluded.real_name;

Second, he suggested a shorthand way of specifying that all the
values from the failed INSERT should be used for the UPDATE:

INSERT INTO persons (username, real_name, data)
VALUES('foobar', 'foo bar', 'baz')
ON DUPLICATE (username) UPDATE;

I think the first point got lost in the discussion of the second
one.

I don't think either point was ever really settled beyond Robert
and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.postgresql.org/message-id/CA+TgmoZN=2AJKi1n4Jz5BkmYi8r_CPUDW+DtoppmTeLVmsOoqw@mail.gmail.com



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 12/18/2014 05:46 PM, Kevin Grittner wrote:
> I don't think either point was ever really settled beyond Robert
> and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT.

I also prefer ON CONFLICT, because that makes more sense when you 
consider exclusion constraints, which I'm still hoping that this would 
support. If not immediately, at least in the future.

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Kevin Grittner
Date:
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 12/18/2014 05:46 PM, Kevin Grittner wrote:
>> I don't think either point was ever really settled beyond Robert
>> and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT.
>
> I also prefer ON CONFLICT, because that makes more sense when you
> consider exclusion constraints, which I'm still hoping that this would
> support. If not immediately, at least in the future.

If you think this can be made to work without a UNIQUE btree index,
that is a persuasive point in favor of ON CONFLICT.  I had missed
(or forgotten) that we thought this could work without a UNIQUE
btree index as the basis of detecting when to resort to an UPDATE.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:
On Mon, Dec 15, 2014 at 11:06 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Dec 15, 2014 at 4:59 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Also, in both Linux and MinGW under option 1 patch I get an OID conflict on
>> OID 3261.
>
> I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice
> about macro collisions on MinGW while I'm at it, since his explanation
> seems plausible.

Attached pair of revised patch sets fix the OID collision, and
presumably fix the MinGW issue (because IGNORE_P is now used as a
token name). It also polishes approach #2 to value locking in a few
places (e.g. better comments). Finally, both patches have a minor
buglet around EXPLAIN ANALYZE output fixed -- the output now indicates
if tuples are pulled up from auxiliary update nodes.

I naively tried this in vallock1 patch:

create table foo(index int, count int);

create unique index on foo(index);

insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;

insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;

insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;

insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count;


After actually reading the documentation more closely, I decided this should be an error because "foo" is not a valid table alias in the "update set" expression.  Instead of being a parsing/planning error, this executes and the foo.count on the RHS of the assignment always evaluates as zero (even on subsequent invocations when TARGET.count is 1).

If I switch to a text type, then I get seg faults under the same condition:

create table foo(index int, count text);
create unique index on foo(index);
insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count;
insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count;
<boom>

#0  pg_detoast_datum_packed (datum=0x0) at fmgr.c:2270
#1  0x000000000074fb7a in textcat (fcinfo=0x1e67a78) at varlena.c:662
#2  0x00000000005a63a5 in ExecMakeFunctionResultNoSets (fcache=0x1e67a08, econtext=0x1e67848, isNull=0x1e68b11 "", isDone=<value optimized out>)
    at execQual.c:2026
#3  0x00000000005a2353 in ExecTargetList (projInfo=<value optimized out>, isDone=0x7fffa7fa346c) at execQual.c:5358
#4  ExecProject (projInfo=<value optimized out>, isDone=0x7fffa7fa346c) at execQual.c:5573
#5  0x00000000005a86c2 in ExecScan (node=0x1e67738, accessMtd=0x5baf00 <SeqNext>, recheckMtd=0x5bad60 <SeqRecheck>) at execScan.c:207
#6  0x00000000005a1918 in ExecProcNode (node=0x1e67738) at execProcnode.c:406
#7  0x000000000059ef32 in EvalPlanQualNext (epqstate=<value optimized out>) at execMain.c:2380
#8  0x00000000005b8fcd in ExecLockUpdateTuple (node=0x1e5f750) at nodeModifyTable.c:1098
#9  ExecInsert (node=0x1e5f750) at nodeModifyTable.c:372
#10 ExecModifyTable (node=0x1e5f750) at nodeModifyTable.c:1396
#11 0x00000000005a1958 in ExecProcNode (node=0x1e5f750) at execProcnode.c:383
#12 0x00000000005a0642 in ExecutePlan (queryDesc=0x1dd0908, direction=<value optimized out>, count=0) at execMain.c:1515
#13 standard_ExecutorRun (queryDesc=0x1dd0908, direction=<value optimized out>, count=0) at execMain.c:308
#14 0x00007f601416b9cb in pgss_ExecutorRun (queryDesc=0x1dd0908, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:874
#15 0x000000000069385f in ProcessQuery (plan=0x1e47df0,
....

So I think there needs to be some kind of logic to de-recognize the table alias "foo".

Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this through an adaptation of my usual torture test, and it ran fine until wraparound shutdown.  I'll poke at it more later.

Cheers,

Jeff

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> After actually reading the documentation more closely, I decided this should
> be an error because "foo" is not a valid table alias in the "update set"
> expression.  Instead of being a parsing/planning error, this executes and
> the foo.count on the RHS of the assignment always evaluates as zero (even on
> subsequent invocations when TARGET.count is 1).
>
> If I switch to a text type, then I get seg faults under the same condition:

> So I think there needs to be some kind of logic to de-recognize the table
> alias "foo".
>
> Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
> through an adaptation of my usual torture test, and it ran fine until
> wraparound shutdown.  I'll poke at it more later.

Oops. I agree with your diagnosis, and will circle around to fix that
bug in the next revision by, as you say, simply rejecting the query if
it doesn't use the two standard aliases.

Thanks for testing!
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 18, 2014 at 7:51 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 12/18/2014 05:46 PM, Kevin Grittner wrote:
>>
>> I don't think either point was ever really settled beyond Robert
>> and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT.
>
>
> I also prefer ON CONFLICT, because that makes more sense when you consider
> exclusion constraints, which I'm still hoping that this would support. If
> not immediately, at least in the future.

This was why I changed the spelling to ON CONFLICT. It also doesn't
hurt that that spelling is dissimilar to MySQL's syntax, IMV, because
there are plenty of things to dislike about ON DUPLICATE KEY UPDATE,
and I think a veneer of compatibility is inappropriate - this syntax
is both considerably more flexible and considerably safer.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 18, 2014 at 6:59 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
>> Good point.
>>
>> For the IGNORE case: I guess the syntax just isn't that flexible. I
>> agree that that isn't ideal.
>
>
> It should be simple to allow multiple key specifications:
>
> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username), (real_name) IGNORE;
>
> It's a rather niche use case, but might as well support it for the sake of
> completeness.

I guess that wouldn't be very hard to implement, and perhaps we should
do so soon. I am reluctant to let scope creep too far, though. As you
mentioned, this is a niche use case.

>> I think that the long and the short of it is that you really ought to
>> have one unique index as an arbiter in mind when writing a DML
>> statement for the UPDATE variant. Relying on this type of convention
>> is possible, I suppose, but ill-advised.
>
> Another thought is that you might want to specify a different action
> depending on which constraint is violated:
>
> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username) IGNORE
> ON CONFLICT (real_name) UPDATE ...;
>
> Although that leaves the question of what to do if both are violated.
> Perhaps:
>
> INSERT INTO persons (username, real_name, data)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username, real_name) IGNORE
> ON CONFLICT (real_name) UPDATE username = excluded.username;
> ON CONFLICT (username) UPDATE real_name = excluded.real_name;

I think that there might be a place for that, but I'd particularly
like to avoid figuring this out now - this suggestion is a complicated
new direction for the patch, and it's not as if adding this kind of
flexibility is precluded by not allowing it in the first version - we
won't paint ourselves into a corner by not doing this up front. The
patch is already complicated enough! Users can always have multiple
UPSERT commands, and that might be very close to good enough for a
relatively rare use case like this.

>> I could easily have the unique index inference specification accept a
>> named opclass, if you thought that was important, and you thought
>> naming a non-default opclass by name was a good SQL interface. It
>> would take only a little effort to support non-default opclasses.
>
> It's a little weird to mention an opclass by name. It's similar to naming an
> index by name, really. How about naming the operator? For an exclusion
> constraint, that would be natural, as the syntax to create an exclusion
> constraint in the first place is "EXCLUDE USING gist (c WITH &&)"
>
> Naming the index by columns makes sense in most cases, and I don't like
> specifying the index's name, but how about allowing naming a constraint?
> Indexes are just an implementation detail, but constraints are not. Unique
> and exclusion constraints are always backed by an index, so there is little
> difference in practice, but I would feel much more comfortable mentioning
> constraints by name than indexes.

The main reason for naming a constraint by name in practice will
probably be because there is no better way to deal with partial unique
indexes (which can be quite useful). But partial unique indexes aren't
formally constraints, in that they don't have pg_constraint entries.
So I don't think that that's going to be acceptable, entirely for that
reason. :-(

> Most people would list the columns, but if there is a really bizarre
> constraint, with non-default opclasses, or an exclusion constraint, it's
> probably been given a name that you could use.

What I find curious about the opclass thing is: when do you ever have
an opclass that has a different idea of equality than the default
opclass for the type? In other words, when is B-Tree strategy number 3
not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
doesn't appear to be the case that it isn't so with any shipped
opclasses - the shipped non-default B-Tree opclasses only serve to
provide alternative notions of sort order, and never "equals".

I think that with B-Tree (which is particularly relevant for the
UPDATE variant), it ought to be defined to work with the type's
default opclass "equals" operator, just like GROUP BY and DISTINCT.
Non-default opclass unique indexes work just as well in practice,
unless someone somewhere happens to create an oddball one that doesn't
use '=' as its "equals" operator (while also having '=' as the default
opclass "equals" operator). I am not aware that that leaves any
actually shipped opclass out (and I include our external extension
ecosystem here, although I might be wrong about that part).

> In theory, with the promise tuple approach to locking, you don't necessarily
> even need an index to back up the constraint.

> So probably better to not allow it.

I agree that we definitely want to require that there is an
appropriate index available.

I think we can live without support for partial unique indexes for the
time being. With non-default opclasses effectively handled (by caring
about the "equals" operator only, and acceptable non-default opclass
indexes when that happens to match the default's), and by assuming
that having an INSERT ... ON CONFLICT IGNORE without an inference
specification to find an exclusion constraint is enough, we have
acceptable semantics, IMV. The worst part of that is that partial
unique indexes cannot be used with the ON CONFLICT UPDATE variant, in
my opinion, but Rome wasn't built in a day.

It would be nice to have a way of discriminating against particular
indexes (unique constraint-related, partial unique, or otherwise) for
the IGNORE variant, but I fear that that'll be difficult to figure out
in time. There is no need to address those questions in the first
version, since I don't think we're failing to play nice with another
major feature. We already have something much more flexible than
equivalent features in other major systems here.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Martijn van Oosterhout
Date:
On Fri, Dec 19, 2014 at 05:32:43PM -0800, Peter Geoghegan wrote:
> > Most people would list the columns, but if there is a really bizarre
> > constraint, with non-default opclasses, or an exclusion constraint, it's
> > probably been given a name that you could use.
>
> What I find curious about the opclass thing is: when do you ever have
> an opclass that has a different idea of equality than the default
> opclass for the type? In other words, when is B-Tree strategy number 3
> not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
> doesn't appear to be the case that it isn't so with any shipped
> opclasses - the shipped non-default B-Tree opclasses only serve to
> provide alternative notions of sort order, and never "equals".

Well, in theory you could build a case insensetive index on a text
column. You could argue that the column should have been defined as
citext in the first place, but it might not for various reasons.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sat, Dec 20, 2014 at 2:16 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
>> What I find curious about the opclass thing is: when do you ever have
>> an opclass that has a different idea of equality than the default
>> opclass for the type? In other words, when is B-Tree strategy number 3
>> not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
>> doesn't appear to be the case that it isn't so with any shipped
>> opclasses - the shipped non-default B-Tree opclasses only serve to
>> provide alternative notions of sort order, and never "equals".
>
> Well, in theory you could build a case insensetive index on a text
> column. You could argue that the column should have been defined as
> citext in the first place, but it might not for various reasons.

That generally works in other systems by having a case-insensitive
collation. I don't know if that implies that non bitwise identical
items can be equal according to the "equals" operator in those other
systems. There aren't too many examples of that happening in general
(I can only think of citext and numeric offhand), presumably because
it necessitates a normalization process (such as lower-casing in the
case of citext) within the hash opclass support function 1, a process
best avoided.

citext is an interesting precedent that supports my argument above,
because citext demonstrates that we preferred to create a new type
rather than a new non-default opclass (with a non-'=' "equals"
operator) when time came to introduce a new concept of "equals" (and
not merely a new, alternative sort order). Again, this is surely due
to the system dependency on the default B-Tree opclass for the
purposes of GROUP BY and DISTINCT, whose behavior sort ordering
doesn't necessarily enter into at all.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 18, 2014 at 9:31 AM, Peter Geoghegan <pg@heroku.com> wrote:
>> So I think there needs to be some kind of logic to de-recognize the table
>> alias "foo".
>>
>> Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
>> through an adaptation of my usual torture test, and it ran fine until
>> wraparound shutdown.  I'll poke at it more later.
>
> Oops. I agree with your diagnosis, and will circle around to fix that
> bug in the next revision

Attached patch fixes the bug. I'm not delighted about the idea of
cutting off parent parse state (the parse state of the insert) within
transformUpdateStmt() only once we've used the parent state to
establish that this is a "speculative"/auxiliary update, but it's
probably the path of least resistance here.

When this is rolled into the next version, there will be a testcase.

Thanks
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Michael Paquier
Date:
On Sun, Dec 21, 2014 at 6:56 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Dec 18, 2014 at 9:31 AM, Peter Geoghegan <pg@heroku.com> wrote:
>>> So I think there needs to be some kind of logic to de-recognize the table
>>> alias "foo".
>>>
>>> Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this
>>> through an adaptation of my usual torture test, and it ran fine until
>>> wraparound shutdown.  I'll poke at it more later.
>>
>> Oops. I agree with your diagnosis, and will circle around to fix that
>> bug in the next revision
>
> Attached patch fixes the bug. I'm not delighted about the idea of
> cutting off parent parse state (the parse state of the insert) within
> transformUpdateStmt() only once we've used the parent state to
> establish that this is a "speculative"/auxiliary update, but it's
> probably the path of least resistance here.
>
> When this is rolled into the next version, there will be a testcase.
Looking at this thread, the last version of this patch is available here:
http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
And they do not apply correctly, so this patch needs a rebase.
Regards,
-- 
Michael



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Looking at this thread, the last version of this patch is available here:
> http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
> And they do not apply correctly, so this patch needs a rebase.

That isn't so. The latest version is much more recent than that. It's
available here:

http://www.postgresql.org/message-id/CAM3SWZQTqsCLZ1YJ1OuWFpO-GmFHwtgwTOg+o_NNzxrPa7Cx4A@mail.gmail.com

Everything is tracked in the commitfest app in detail.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Michael Paquier
Date:
On Mon, Dec 22, 2014 at 11:20 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Dec 21, 2014 at 6:10 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Looking at this thread, the last version of this patch is available here:
>> http://www.postgresql.org/message-id/CAM3SWZRvkCKc=1Y6_Wn8mk97_Vi8+j-aX-RY-=msrJVU-Ec-qw@mail.gmail.com
>> And they do not apply correctly, so this patch needs a rebase.
>
> That isn't so. The latest version is much more recent than that. It's
> available here:
>
> http://www.postgresql.org/message-id/CAM3SWZQTqsCLZ1YJ1OuWFpO-GmFHwtgwTOg+o_NNzxrPa7Cx4A@mail.gmail.com
>
> Everything is tracked in the commitfest app in detail.
Oops, sorry. I got mistaken because of the name of the latest attachments.
-- 
Michael



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Heikki Linnakangas
Date:
On 12/20/2014 11:14 PM, Peter Geoghegan wrote:
> On Sat, Dec 20, 2014 at 2:16 AM, Martijn van Oosterhout
> <kleptog@svana.org> wrote:
>>> What I find curious about the opclass thing is: when do you ever have
>>> an opclass that has a different idea of equality than the default
>>> opclass for the type? In other words, when is B-Tree strategy number 3
>>> not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
>>> doesn't appear to be the case that it isn't so with any shipped
>>> opclasses - the shipped non-default B-Tree opclasses only serve to
>>> provide alternative notions of sort order, and never "equals".
>>
>> Well, in theory you could build a case insensetive index on a text
>> column. You could argue that the column should have been defined as
>> citext in the first place, but it might not for various reasons.
>
> That generally works in other systems by having a case-insensitive
> collation. I don't know if that implies that non bitwise identical
> items can be equal according to the "equals" operator in those other
> systems. There aren't too many examples of that happening in general
> (I can only think of citext and numeric offhand), presumably because
> it necessitates a normalization process (such as lower-casing in the
> case of citext) within the hash opclass support function 1, a process
> best avoided.
>
> citext is an interesting precedent that supports my argument above,
> because citext demonstrates that we preferred to create a new type
> rather than a new non-default opclass (with a non-'=' "equals"
> operator) when time came to introduce a new concept of "equals" (and
> not merely a new, alternative sort order). Again, this is surely due
> to the system dependency on the default B-Tree opclass for the
> purposes of GROUP BY and DISTINCT, whose behavior sort ordering
> doesn't necessarily enter into at all.

Yeah, I don't expect it to happen very often. It's confusing to have 
multiple definitions of equality.

There is one built-in example: the "record *= record" operator [1]. It's 
quite special purpose, the docs even say that they "are not intended to 
be generally useful for writing queries". But there they are.

I feel that it needs to be possible to specify the constraint 
unambiguously in all cases. These are very rare use cases, but we should 
have an escape hatch for the rare cases that need it.


What would it take to also support partial indexes?

[1] See 
http://www.postgresql.org/docs/devel/static/functions-comparisons.html#ROW-WISE-COMPARISON

- Heikki




Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 22, 2014 at 1:24 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> I feel that it needs to be possible to specify the constraint unambiguously
> in all cases. These are very rare use cases, but we should have an escape
> hatch for the rare cases that need it.
>
>
> What would it take to also support partial indexes?

Aside from considerations about how to pick them without using their
name, partial unique indexes aren't special at all. My earlier concern
was that we'd need to account for before row insert triggers that
change values out from under us. But maybe that concern was overblown,
come to think of it. I am already borrowing a little bit of the raw
parser's logic for CREATE INDEX statements for unique index inference
(during parse analysis) -- we're matching the cataloged index
definition attributes/expressions, so this makes a lot of sense. Maybe
I had the wrong idea about partial indexes earlier, which was that we
must use the values in the tuple proposed for insertion to check that
a partial index was a suitable arbiter of whether or not the UPDATE
path should be taken in respect of any given tuple. I should just go
further with borrowing things from CREATE INDEX, and give the user an
optional way of specifying a WHERE clause that is also matched in a
similar way to the expressions themselves. Did the partial unique
index your UPSERT implied not cover the ultimate tuple inserted after
before row insert triggers fired? That's on you as a user...you'll
always get an insert, since there won't be a would-be duplicate
violation to make there be an update.

I actually care about partial unique indexes a lot. They're a very
useful feature. Back when I was an application developer, I frequently
used "is_active" boolean columns to represent "logical app-level
deletion", where actually deleting the tuple was not possible (e.g.
because it may still be referenced in historic records), while not
wanting to have it be subject to uniqueness checks as a logically
deleted/!is_active tuple.

This measure to support partial indexes, plus the additional leeway
around non-default opclass unique indexes that I can add (that they
need only match the "equals" operator of the default opclass to be
accepted) brings us 99.9% of the way. That only leaves:

* An inability to specifying some subset of unique indexes or
exclusion constraints for the IGNORE variant (the UPDATE variant is
irrelevant).

* An inability to specifying a IGNORE arbitrating *exclusion
constraint* as the sole arbiter of whether or not the IGNORE path
should be taken. (exclusion constraints are not usable for the UPDATE
variant, so that's irrelevant again).

Did I forget something?

The use cases around these limitations are very rare, and only apply
to the IGNORE variant which seems much less interesting. I'm quite
comfortable dealing with them in a later release of PostgreSQL, to cut
scope (or avoid adding scope) for 9.5. Do you think that's okay? How
often will the IGNORE variant be used when everything shouldn't be
IGNOREd anyway? Although, to be totally fair, I should probably also
include:

* non-default B-tree opclasses cannot be specified as arbiters of the
alternative path (for both IGNORE and UPDATE variants) iff their
"equals" operator happens to not be the "equals" operator of the
default opclass (which is theoretical, and likely non-existent as a
use case).

If you're dead set on having an escape hatch, maybe we should just get
over it and add a way of specifying a unique index by name. As I said,
these under-served use cases are either exceedingly rare or entirely
theoretical.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Robert Haas
Date:
On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> If you're dead set on having an escape hatch, maybe we should just get
> over it and add a way of specifying a unique index by name. As I said,
> these under-served use cases are either exceedingly rare or entirely
> theoretical.

I'm decidedly unenthusiastic about that.  People don't expect CREATE
INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML.  I
think the solution in this case would be a gateway to problems larger
than the one we're trying to solve.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Dec 23, 2014 at 5:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> If you're dead set on having an escape hatch, maybe we should just get
>> over it and add a way of specifying a unique index by name. As I said,
>> these under-served use cases are either exceedingly rare or entirely
>> theoretical.
>
> I'm decidedly unenthusiastic about that.  People don't expect CREATE
> INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML.  I
> think the solution in this case would be a gateway to problems larger
> than the one we're trying to solve.

I tend to agree. I think we should just live with the fact that not
every conceivable use case will be covered, at least initially. Then,
if an appreciable demand for even more flexibility emerges, we can
revisit this. We already have a syntax that is significantly more
flexible than the equivalent feature in any other system. Let's not
lose sight of that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Tue, Dec 23, 2014 at 11:30 AM, Peter Geoghegan <pg@heroku.com> wrote:
> I tend to agree. I think we should just live with the fact that not
> every conceivable use case will be covered, at least initially.

To be clear: I still think I should go and make the changes that will
make the feature play nice with all shipped non-default B-Tree
operator classes, and will make it work with partial unique indexes
[1]. That isn't difficult or controversial, AFAICT, and gets us very
close to satisfying every conceivable use case.

[1] http://www.postgresql.org/message-id/CAM3SWZQdv7GDLwPRv7=rE-gG1QjLOOL3vCmAriCBcTYk8GwqKw@mail.gmail.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Stephen Frost
Date:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Dec 23, 2014 at 5:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Mon, Dec 22, 2014 at 5:04 PM, Peter Geoghegan <pg@heroku.com> wrote:
> >> If you're dead set on having an escape hatch, maybe we should just get
> >> over it and add a way of specifying a unique index by name. As I said,
> >> these under-served use cases are either exceedingly rare or entirely
> >> theoretical.
> >
> > I'm decidedly unenthusiastic about that.  People don't expect CREATE
> > INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY to break their DML.  I
> > think the solution in this case would be a gateway to problems larger
> > than the one we're trying to solve.
>
> I tend to agree. I think we should just live with the fact that not
> every conceivable use case will be covered, at least initially. Then,
> if an appreciable demand for even more flexibility emerges, we can
> revisit this. We already have a syntax that is significantly more
> flexible than the equivalent feature in any other system. Let's not
> lose sight of that.

+1
Thanks,
    Stephen

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've put this through an adaptation of my usual torture test, and it ran
> fine until wraparound shutdown.  I'll poke at it more later.

Could you elaborate, please? What are the details of the torture test
you're performing?

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Dec 19, 2014 at 5:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Most people would list the columns, but if there is a really bizarre
>> constraint, with non-default opclasses, or an exclusion constraint, it's
>> probably been given a name that you could use.
>
> What I find curious about the opclass thing is: when do you ever have
> an opclass that has a different idea of equality than the default
> opclass for the type? In other words, when is B-Tree strategy number 3
> not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
> doesn't appear to be the case that it isn't so with any shipped
> opclasses - the shipped non-default B-Tree opclasses only serve to
> provide alternative notions of sort order, and never "equals".
>
> I think that with B-Tree (which is particularly relevant for the
> UPDATE variant), it ought to be defined to work with the type's
> default opclass "equals" operator, just like GROUP BY and DISTINCT.
> Non-default opclass unique indexes work just as well in practice,
> unless someone somewhere happens to create an oddball one that doesn't
> use '=' as its "equals" operator (while also having '=' as the default
> opclass "equals" operator). I am not aware that that leaves any
> actually shipped opclass out (and I include our external extension
> ecosystem here, although I might be wrong about that part).

So looking at the way the system deals with its dependence on default
operator classes, I have a hard time justifying all this extra
overhead for the common case. The optimizer will refuse to use an
index with a non-default opclass even when AFAICT there is no *real*
semantic dependence on anything other than the "equals" operator,
which seems to always match across a type's opclasses anyway. e.g.,
DISTINCT will only use a non-default opclass B-Tree index, even though
in practice the "equals" operator always matches for shipped
non-default opclasses; DISTINCT will not work with a text_pattern_ops
index, while it will work with a default text B-Tree opclass index,
*even though no corresponding "ORDER BY" was given*.

Someone recently pointed out in a dedicated thread that the system
isn't all that bright about exploiting the fact that group aggregates
don't necessarily need to care about facets of sort ordering like
collations, which have additional overhead [1]. That might be a useful
special case to target (to make underlying sorts faster), but the big
picture is that the system doesn't know when it only needs to care
about an "equals" operator matching some particular
B-Tree-opclass-defined notion of sorting, rather than caring about a
variety of operators matching. Sometimes, having a matching "equals"
operator of some non-default opclass is good enough to make an index
(or sort scheme) of that opclass usable for some purpose that only
involves equality, and not sort order (like DISTINCT, with no ORDER
BY, executed using a GroupAggregate, for example).

I thought we should formalize the idea that a non-default opclass must
have the same notion of equality (the same "equals" operator) as its
corresponding default opclass, if any. That way, presumably the
optimizer has license to be clever about only caring about
"DISTINCTness"/equality. That also gives my implementation license to
not care about which operator class a unique index uses -- it must not
matter.

Heikki pointed out that there is one shipped opclass that has an
"equals" operator that happens to not be spelt "=" [2] (and
furthermore, does not match that of the default opclass). That's the
record_image_ops opclass, which unusually has an "equals" operator of
"*=". So as Heikki pointed out, it looks like there is some limited
precedent for having to worry about B-Tree opclasses that introduce
alternative notions of "equals", rather than merely alternative
notions of sort order. So so much for formalizing that all of a type's
B-Tree opclass "equals" operators must match...

...having thought about it for a while more, though, I think we should
*still* ignore opclass for the purposes of unique index inference. The
implementation doesn't care about the fact that you used a non-default
opclass. Sure, in theory that could lead to inconsistencies, if there
was multiple unique indexes of multiple opclasses that just so
happened to have incompatible ideas about equality, but that seems
ludicrous...we have only one extremely narrow example of how that
could happen. Plus there'd have to be *both* unique indexes defined
and available for us to infer as appropriate, before the inference
logic could accidentally infer the wrong idea of equality. That seems
like an extremely implausible scenario. Even if we allow for the idea
that alternative notions of equality are something that will happen in
the wild, obviously the user cares about the definition of equality
that they actually used for the unique index in question.

We can document that unique index inference doesn't care about
opclasses (recall that I still only plan on letting users infer a
B-Tree unique index), which is thought to almost certainly not matter.
I think that ought to be fine. In the next revision of UPSERT, the
implementation formally won't care about the opclass of an index when
inferring a unique index to use as an arbiter of whether to take the
alternative IGNORE/UPDATE path. That's formally left undefined.

As already discussed before, I will still proceed with allowing the
user to pick a partial unique index when writing a unique index
inference specification.

[1] http://www.postgresql.org/message-id/CAFjtmHU3Obf5aSpWY7i18diapvjg-418hYySdqUuYhXZtjChhg@mail.gmail.com
[2] http://www.postgresql.org/message-id/54988BF5.9000405@vmware.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:

On Tue, Dec 23, 2014 at 11:55 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've put this through an adaptation of my usual torture test, and it ran
> fine until wraparound shutdown.  I'll poke at it more later.

Could you elaborate, please? What are the details of the torture test
you're performing?

I've uploaded it here.


The gist of it is that I increment a count column of a random row (via pk) in multiple connections simultaneously.

When the server crashes, or it gets to a certain number of increments, the threads report their activity up to the parent, which then waits for automatic recovery and compares the state of the database to the reported state of the children threads.

That is for my original code.  For this purpose, I made the count go either up or down randomly, and when a row's count passes through zero it gets deleted.  Then when it is chosen for increment/decrement again, it has to be inserted.  I've made this happen either through a update-or-insert-or-retry loop (two variants) or by using your new syntax.

There is a patch which adds a simulation for a torn-page-write followed by a crash, and also adds some elogs that I've sometimes found useful for tracking down problems, with new GUCs to control them.

I don't think you made changes to the WAL/recovery routines, so I don't expect crashing recovery to be a big hazard for your patch, but I wanted to run a test where I was generally familiar with the framework, and thought an independently derived test might exercise some new aspects.

The one thing I noticed is that using your syntax starts out slightly slower than the retry loop, but then gets much slower (down by 2 or 3 times) after a while.  It might be a vacuuming issue.  The constant intentional crashes interferes with good vacuuming behavior, and I need to retest this with the intentional crashes turned off to see if that fixes it.  I'm having difficult access to my usual testing hardware over the holidays, so I'm not getting as much done as I hoped.

I'll try to look at your own stress tests on github as well.

Cheers,

Jeff

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Sat, Dec 27, 2014 at 11:48 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Could you elaborate, please? What are the details of the torture test
>> you're performing?

> The gist of it is that I increment a count column of a random row (via pk)
> in multiple connections simultaneously.

This is great. In general, I strongly believe that we should be doing
this kind of thing more formally and more frequently. Thanks!

> That is for my original code.  For this purpose, I made the count go either
> up or down randomly, and when a row's count passes through zero it gets
> deleted.  Then when it is chosen for increment/decrement again, it has to be
> inserted.  I've made this happen either through a update-or-insert-or-retry
> loop (two variants) or by using your new syntax.

Did you continue to limit your investigation to value locking approach
#1? I think that #2 is the more likely candidate for commit, that we
should focus on. However, #1 is more "conceptually pure", and is
therefore an interesting basis of comparison with #2 when doing this
kind of testing.

> There is a patch which adds a simulation for a torn-page-write followed by a
> crash, and also adds some elogs that I've sometimes found useful for
> tracking down problems, with new GUCs to control them.

Cool.

> I don't think you made changes to the WAL/recovery routines, so I don't
> expect crashing recovery to be a big hazard for your patch, but I wanted to
> run a test where I was generally familiar with the framework, and thought an
> independently derived test might exercise some new aspects.

Value locking approach #2 does touch crash recovery. Value locking
approach #1 does not.

I certainly see the logic in starting with independently derived
tests. We all have our blind spots.

> The one thing I noticed is that using your syntax starts out slightly slower
> than the retry loop, but then gets much slower (down by 2 or 3 times) after
> a while.  It might be a vacuuming issue.

Interesting. I'd like to compare both approaches to value locking here.

> I'll try to look at your own stress tests on github as well.

Would you be opposed to merging your custom stress-test suite into my
git repo? I'll give you the ability to push to it.

I can help you out if you think you'd benefit from access to my
Quad-core server (Intel Core i7-4770) for stress-testing. I'll
coordinate with you about it privately.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Fri, Dec 26, 2014 at 4:22 PM, Peter Geoghegan <pg@heroku.com> wrote:
> So looking at the way the system deals with its dependence on default
> operator classes, I have a hard time justifying all this extra
> overhead for the common case.

Attached pair of revised patch sets, V1.8:

* Explicitly leaves undefined what happens when a non-default opclass
index *with an alternative notion of not just sort order, but
equality* exists. In practice it depends on the available unique
indexes. I really found it impossible to justify imposing any
restriction here, given the total lack of a scenario in which this
even *could* matter, let alone will. This is a minor wart, but I think
it's acceptable.

* Allows "unique index inference specification" clause to have a WHERE
clause (this is distinct from the WHERE clause that might also appear
in the UPDATE auxiliary query). This can be used to infer partial
unique indexes. I really didn't want to give up support for partial
indexes with the UPDATE variant (recall that the UPDATE variant
*requires* an inference clause), since partial unique indexes are
particularly useful.

Note that the unique index must actually cover the tuple at insert
time, or an error is raised. An example of this that appears in the
regression tests is:

insert into insertconflicttest values (23, 'Uncovered by Index') on
conflict (key where fruit like '%berry') ignore;
ERROR:  partial arbiter unique index has predicate that does not cover
tuple proposed for insertion
DETAIL:  ON CONFLICT inference clause implies that the tuple proposed
for insertion actually be covered by partial predicate for index
"partial_key_index".
HINT:  ON CONFLICT inference clause must infer a unique index that
covers the final tuple, after BEFORE ROW INSERT triggers fire.

* New documentation reflecting the above. A couple of paragraphs in
the INSERT SQL reference page now covers these topics.

* Fix Jeff Jane's bug by added sanitizing code [1]. Certain illegal
queries now correctly rejected during parse analysis.

* Fixed another tiny buglet in EXPLAIN ANALYZE output with a RETURNING
clause, by making sure auxiliary query plan from update also has
plan-level targetlist set.

* Minor clean-up to code comments here and there (in particular, for
the ExcludedExpr primnode used to implement the EXCLUDED.*
pseudo-alias thing).

* Better serialization failure error messages.

I recommend looking at my mirror of the modified documentation:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html
to get up to speed on how unique index inference specification clause
have been extended to support partial unique indexes. As I mentioned,
apart from that, the INSERT SQL reference page now covers the
definition of a "CONFLICT"/opclass semantics issues.

I really hope that this deals with all semantics/syntax related loose
ends, allowing discussion of this patch to take a more low-level
focus, which is what is really needed. I feel that further
improvements may be possible, and that the syntax can be even more
flexible, but it's already flexible enough for our first iteration of
this feature. Importantly, we have something that is enormously more
flexible than any equivalent feature in any other system, which
includes the flexibility to extend the syntax in various other
directions (e.g. specifying particular exclusion constraints).

[1] http://archives.postgresql.org/message-id/CAM3SWZT=HptrGyihZiyT39sPBhp+CXOTW=MhNFzXiLf-Jh4QVA@mail.gmail.com
--
Peter Geoghegan

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:
On Sun, Dec 28, 2014 at 3:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Dec 26, 2014 at 4:22 PM, Peter Geoghegan <pg@heroku.com> wrote:
> So looking at the way the system deals with its dependence on default
> operator classes, I have a hard time justifying all this extra
> overhead for the common case.

Attached pair of revised patch sets, V1.8:

Hi Peter,

Using the vallock2 version of V1.8, using the test I previously described, I get some all-null rows, which my code should never create.  Also, the index and table don't agree, in this example I find 3 all-null rows in the table, but only 2 in the index.  I've attached an example output of querying via index and via full table scan, and also the pageinspect output of the blocks which have the 3 rows, in case that is helpful.

This was just a straight forward issue of firing queries at the database, the crash-inducing part of my test harness was not active during this test.  I also ran it with my crashing patch reversed out, in case I introduced the problem myself, and it still occurs.

Using V1.7 of the vallock2 patch, I saw the same thing with some all-null rows.  I also saw some other issues where two rows with the same key value would be present twice in the table (violating the unique constraint) but only one of them would appear in the index.  I suspect it is caused by the same issue as the all-null rows, and maybe I just didn't run v1.8 enough times to find that particular manifestation under v1.8.

Cheers,

Jeff

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
Hi Jeff,

On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create.  Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index.  I've attached an example output of querying via
> index and via full table scan, and also the pageinspect output of the blocks
> which have the 3 rows, in case that is helpful.

Interesting. Thanks a lot for your help!

> This was just a straight forward issue of firing queries at the database,
> the crash-inducing part of my test harness was not active during this test.
> I also ran it with my crashing patch reversed out, in case I introduced the
> problem myself, and it still occurs.
>
> Using V1.7 of the vallock2 patch, I saw the same thing with some all-null
> rows.  I also saw some other issues where two rows with the same key value
> would be present twice in the table (violating the unique constraint) but
> only one of them would appear in the index.  I suspect it is caused by the
> same issue as the all-null rows, and maybe I just didn't run v1.8 enough
> times to find that particular manifestation under v1.8.

This is almost certainly a latent bug with approach #2 to value
locking, that has probably been there all along. Semantics and syntax
have been a recent focus, and so the probability that I introduced a
regression of this nature in any recent revision seems low. I am going
to investigate the problem, and hope to have a diagnosis soon.

Once again, thanks!
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I've attached an example output of querying via index and via full table
> scan, and also the pageinspect output of the blocks which have the 3 rows,
> in case that is helpful.

You might have also included output from pageinspect's bt_page_items()
function. Take a look at the documentation patch I just posted if the
details are unclear.

Thanks
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create.  Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index.

Just to be clear: You haven't found any such issue with approach #1 to
value locking, right?

I'm curious about how long it took you to see the issue with #2. Were
there any special steps? What were the exact steps involved in turning
off the hard crash mechanism you mention? It looks like the condition
you describe ought to be highlighted by the script automatically. Is
that right? (I don't know any Perl and the script isn't really
documented at a high level).

Thanks
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Jeff Janes
Date:
On Mon, Dec 29, 2014 at 9:12 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Dec 29, 2014 at 2:29 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Using the vallock2 version of V1.8, using the test I previously described, I
> get some all-null rows, which my code should never create.  Also, the index
> and table don't agree, in this example I find 3 all-null rows in the table,
> but only 2 in the index.

Just to be clear: You haven't found any such issue with approach #1 to
value locking, right?

Correct, I haven't seen any problems with approach #1
 

I'm curious about how long it took you to see the issue with #2. Were
there any special steps? What were the exact steps involved in turning
off the hard crash mechanism you mention?

Generally the problem will occur early on in the process, and if not then it will not occur at all.  I think that is because the table starts out empty, and so a lot of insertions collide with each other.  Once the table is more thoroughly populated, most query takes the CONFLICT branch and therefore two insertion-branches are unlikely to collide.

At its simplest, I just use the count_upsert.pl script and your patch and forget all the rest of the stuff from my test platform.

So:
 
pg_ctl stop -D /tmp/data2; rm /tmp/data2 -r; 
../torn_bisect/bin/pg_ctl initdb -D /tmp/data2; 
../torn_bisect/bin/pg_ctl start -D /tmp/data2 -o "--fsync=off" -w ; 
createdb;
perl count_upsert.pl 8 100000

A run of count_upsert.pl 8 100000 takes about 30 seconds on my machine (8 core), and if it doesn't create a problem then I just destroy the database and start over.

The fsync=off is not important, I've seen the problem once without it.  I just include it because otherwise the run takes a lot longer.

I've attached another version of the count_upsert.pl script, with some more logging targeted to this particular issue. 

The problem shows up like this:

init done at count_upsert.pl line 97.
sum is 1036
count is 9720
seq scan doesn't match index scan  1535 == 1535 and 1 == 6 $VAR1 = [
          [
            6535,
            -21
          ],
.....
(Thousands of more lines, as it outputs the entire table twice, once gathered by seq scan, once by bitmap index scan).

The first three lines are normal, the problem starts with the "seq scan doesn't match"...

In this case the first problem it ran into was that key 1535 was present once with a count column of 1 (found by seq scan) and once with a count column of 6 (found by index scan).  It was also in the seq scan with a count of 6, but the way the comparison works is that it sorts each representation of the table by the key column value and then stops at the first difference, in this case count columns 1 == 6 failed the assertion.

If you get some all-NULL rows, then you will also get Perl warnings issued when the RETURNING clause starts returning NULL when none are expected to be.

The overall pattern seems to be pretty streaky.  It could go 20 iterations with no problem, and then it will fail several times in a row.  I've seen this pattern quite a bit with other race conditions as well, I think that they may be sensitive to how memory gets laid out between CPUs, and that might depend on some longer-term characteristic of the state of the machine that survives an initdb.

By the way, I also got a new error message a few times that I think might be a manifestation of the same thing:

ERROR:  duplicate key value violates unique constraint "foo_index_idx"
DETAIL:  Key (index)=(6106) already exists.
STATEMENT:  insert into foo (index, count) values ($2,$1) on conflict (index)
                      update set count=TARGET.count + EXCLUDED.count returning foo.count

Cheers,

Jeff

Attachment

Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From
Peter Geoghegan
Date:
On Mon, Dec 29, 2014 at 11:52 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Correct, I haven't seen any problems with approach #1

That helps with debugging #2, then. That's very helpful.

> Generally the problem will occur early on in the process, and if not then it
> will not occur at all.  I think that is because the table starts out empty,
> and so a lot of insertions collide with each other.  Once the table is more
> thoroughly populated, most query takes the CONFLICT branch and therefore two
> insertion-branches are unlikely to collide.
>
> At its simplest, I just use the count_upsert.pl script and your patch and
> forget all the rest of the stuff from my test platform.

I can reproduce this on my laptop now. I think that building at -O2
and without assertions helps. I'm starting to work through debugging
it.

I threw together a quick script for getting pg_xlogdump into a
Postgres table (a nice use of the new pg_lsn type). It's here:

https://github.com/petergeoghegan/jjanes_upsert/blob/master/pg_xlogdump2csv.py

It tells a story. Looking at the last segment before shutdown when the
problem occurred, I see:

postgres=# select count(*), tx from my_xlogdump group by tx having
count(*) > 4 order by 1;count |   tx
-------+---------    5 | 1917836    5 | 1902576    5 | 1909746    5 | 1901586    5 | 1916971    6 | 1870077   39 |
1918004 119 | 1918003 2246 |       0
 
(9 rows)

postgres=# select max(tx::text::int4) from my_xlogdump ;  max
---------1918004
(1 row)

So the last two transactions (1918003 and 1918004) get into some kind
of live-lock situation, it looks like. Or at least something that
causes them to produce significant more WAL records than other xacts
due to some kind of persistent problem with conflicts.

Here is where the earlier of the two problematic transactions has its
first record:

postgres=# select * from my_xlogdump where tx = '1918003' order by
r_lsn asc limit 1;rmgr | len_rec | len_tot |   tx    |   r_lsn    |  prev_lsn  |              descr
------+---------+---------+---------+------------+------------+----------------------------------------------------Heap
|      3 |     203 | 1918003 | 0/1783BB70 | 0/1783BB48 | INSERT
 
off 33 blkref #0: rel 1663/16471/12502 blk
(1 row)

After and including that record, until the trouble spot up to and
including shutdown, here is the rmgr breakdown:

postgres=# select count(*),  rmgr from my_xlogdump where r_lsn >=
'0/1783BB70' group by rmgr order by 1;count |    rmgr
-------+-------------    1 | XLOG         -- 1 CHECKPOINT_SHUTDOWN record    2 | Transaction -- commit records for the
twoxacts   20 | Heap2        -- all are CLEAN remxid records, tx is 0   76 | Heap          -- All from our two xacts...
 80 | Btree          -- All from XID 1918003 only
 
(5 rows)

So looks like a bad interaction with VACUUM. Maybe it's a problem with
VACUUM interlocking. That was my first suspicion, FWIW.

I'll need to do more investigating, but I can provide a custom format
dump of the table, in case anyone wants to look at what I have here in
detail. I've uploaded it to:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/files/my_xlogdump.custom.dump.table
-- 
Peter Geoghegan