Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Nico Williams
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id 20171102203223.GT4496@localhost
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, Nov 02, 2017 at 12:51:45PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico@cryptonector.com> wrote:
> >If you want to ignore conflicts arising from concurrency you could
> >always add an ON CONFLICT DO NOTHING to the INSERT DML in the mapping I
> >proposed earlier.  Thus a MERGE CONCURRENTLY could just do that.
> >
> >Is there any reason not to map MERGE as I proposed?
> 
> Performance, for one. MERGE generally has a join that can be optimized
> like an UPDATE FROM join.

Ah, right, I think my mapping was pessimal.  How about this mapping
instead then:

WITH   updated AS (     UPDATE <target>     SET ...     WHERE <condition>     RETURNING <target> ) , inserted AS (
INSERTINTO <target>     SELECT ...     WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..     /*      * Add ON
CONFLICTDO NOTHING here to avoid conflicts in the face      * of concurrency.      */     RETURNING <target> )
 
DELETE FROM <target>
WHERE <key> NOT IN (SELECT <key> FROM updated) AND     <key> NOT IN (SELECT <key> FROM inserted) AND ...;

?

If a MERGE has no delete clause, then the mapping would be:

WITH   updated AS (     UPDATE <target>     SET ...     WHERE <condition>     RETURNING <target> )
INSERT INTO <target>
SELECT ...
WHERE <key> NOT IN (SELECT <key> FROM updated) AND ..
/** Add ON CONFLICT DO NOTHING here to avoid conflicts in the face* of concurrency.*/
;

> I haven't studied this question in any detail, but FWIW I think that
> using CTEs for merging is morally equivalent to a traditional MERGE
> implementation. [...]

I agree.  So why not do that initially?  Optimize later.

Such a MERGE mapping could be implemented entirely within
src/backend/parser/gram.y ...

Talk about cheap to implement, review, and maintain!

Also, this would be notionally very simple.

Any optimizations to CTE query/DML execution would be generic and
applicable to MERGE and other things besides.  If mapping MERGE to
CTE-using DMLs motivates such optimizations, all the better.

>          [...]. It may actually be possible to map from CTEs to a MERGE
> statement, but I don't think that that's a good approach to implementing
> MERGE.

Surely not every DML with CTEs can map to MERGE.  Maybe I misunderstood
your comment?

> Most of the implementation time will probably be spent doing things like
> making sure MERGE behaves appropriately with triggers, RLS, updatable
> views, and so on. That will take quite a while, but isn't particularly
> technically challenging IMV.

Note that mapping to a DML with CTEs as above gets triggers, RLS, and
updateable views right from the get-go, because DMLs with CTEs, and DMLs
as CTEs, surely do as well.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] [PATCH] Add ALWAYS DEFERRED option for constraints
Next
From: Simon Riggs
Date:
Subject: Re: [HACKERS] MERGE SQL Statement for PG11