Home > mailing lists

Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From	Nico Williams
Subject	Re: [HACKERS] MERGE SQL Statement for PG11
Date	November 8, 2017 02:29:22
Msg-id	20171107232920.GZ4496@localhost Whole thread Raw
In response to	Re: [HACKERS] MERGE SQL Statement for PG11 (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: [HACKERS] MERGE SQL Statement for PG11
List	pgsql-hackers

Tree view

On Thu, Nov 02, 2017 at 03:25:48PM -0700, Peter Geoghegan wrote:
> Nico Williams <nico@cryptonector.com> wrote:
> >A MERGE mapped to a DML like this:

I needed to spend more time reading MERGE docs from other RDBMSes.

The best MERGE so far is MS SQL Server's, which looks like:
 MERGE INTO <target> <target_alias> USING <source> <source_alias> ON (<join condition>)  -- optional: WHEN MATCHED THEN
UPDATESET ...

 -- optional: WHEN NOT MATCHED [ BY TARGET ] THEN INSERT ...
 -- optional: WHEN NOT MATCHED BY SOURCE THEN DELETE
 -- optional: OUTPUT ...  ;

(The other MERGEs are harder to use because they lack a WHEN NOT MATCHED
BY SOURCE THEN DELETE, instead having a DELETE clause on the UPDATE,
which is then difficult to use.)

This is *trivial* to map to a CTE, and, in fact, I and my colleagues
have resorted to hand-coded CTEs like this precisely because PG lacks
MERGE (though we ourselves didn't know about MERGE -- it's new to us).

If <source> is a query, then we start with a CTE for that, else if it's
a view or table, then we don't setup a CTE for it.  Each of the UPDATE,
INSERT, and/or DELETE can be it's own CTE.  If there's an OUTPUT clause,
that can be a final SELECT that queries from the CTEs that ran the DMLs
with RETURNING.  If there's no OUTPUT then none of the DMLs need to have
RETURNING, and one of them will be the main statement, rather than a
CTE.

The pattern is:
 WITH   -- IFF <source> is a query:   <source_alias> AS (<source>),
   -- IFF there's a WHEN MATCHED THEN UPDATE   updates AS (     UPDATE <target> AS <target_alias> SET ...     FROM
<source>    WHERE <join_condition>     -- IFF there's an OUTPUT clause, then:     RETURNING 'update' as "@action", ...
),

   inserts AS (     INSERT INTO <target> (<column_list>)     SELECT ...     FROM <source>     LEFT JOIN <target> ON
<join_condition>    WHERE <target> IS NOT DISTINCT FROM NULL     -- IFF there's a CONCURRENTLY clause:     ON CONFLICT
DONOTHING     -- IFF there's an OUTPUT clause, then:     RETURNING 'insert' as "@action", ...   ),

   deletes AS (     DELETE FROM <target>     WHERE NOT EXISTS (SELECT * FROM <source> WHERE <join_condition>)     --
IFFthere's an OUTPUT clause, then:     RETURNING 'delete' as "@action", ...   ),

 -- IFF there's an OUTPUT clause SELECT * FROM updates UNION SELECT * FROM inserts UNION SELECT * FROM deletes;

If there's not an output clause then one of the DMLs has to be the main
statement:
 WITH ... DELETE ...; -- or UPDATE, or INSERT

Note that if the source is a view or table and there's no OUTPUT clause,
then it's one DML with up to (but not referring to) two CTEs, and in all
cases the CTEs do not refer to each other.  This means that the executor
can parallelize all of the DMLs.

If the source is a query, then that could be made a temp view to avoid
having to run the query first.  The CTE executor needs to learn to
sometimes do this anyways, so this is good.

The <deletes> CTE can be equivalently written without a NOT EXISTS:
   to_be_deleted AS (     SELECT <target>     FROM <target>     LEFT JOIN <source> ON (<join_condition>)     WHERE
<source>IS NOT DISTINCT FROM NULL   ),   deletes AS (     DELETE FROM <target>     USING to_be_deleted tbd     WHERE
<target>= <tbd>   )

if that were to run faster (probably not, since PG today would first run
the to_be_deleted CTE, then the deletes CTE).  I mention only because
it's nice to see the symmetry of LEFT JOINs for the two WHEN NOT MATCHED
cases.

(Here <source> is the alias for it if one was given.)

***

This mapping triggers triggers as one would expect (at least FOR EACH
ROW; I expect the DMLs in CTEs should also trigger FOR EACH STATEMENT
triggers, and if they don't I consider that a bug).

> This is a bad idea. An implementation like this is not at all
> maintainable.

I beg to differ.  First of all, not having to add an executor for MERGE
is a win: much, much less code to maintain.  The code to map MERGE to
CTEs can easily be contained entirely in src/backend/parser/gram.y,
which is a maintainability win: any changes to how CTEs are compiled
will fail to compile if they break the MERGE mapping to CTEs.

> >can handle concurrency via ON CONFLICT DO NOTHING in the INSERT CTE.
> 
> That's not handling concurrency -- it's silently ignoring an error. Who
> is to say that the conflict that IGNORE ignored is associated with a row
> visible to the MVCC snapshot of the statement? IOW, why should the DELETE
> affect any row?

That was me misunderstanding MERGE.  The DELETE is independent of the
INSERT -- if an INSERT does nothing because of an ON CONFLICT DO
NOTHING clause, then that won't cause that row to be deleted -- the
inserts and deletes CTEs are independent in the latest mapping (see
above).

I believe adding ON CONFLICT DO NOTHING to the INSERT in this mapping is
all that's needed to support concurrency.

> There are probably a great many reasons why you need a ModifyTable
> executor node that keeps around state, and explicitly indicates that a
> MERGE is a MERGE. For example, we'll probably want statement level
> triggers to execute in a fixed order, regardless of the MERGE, RLS will
> probably require explicitly knowledge of MERGE semantics, and so on.

Let's take those examples one at a time:
- Is there a reason to believe that MERGE could not parallelize the  DMLs it implies?
  If they can be parallelized, then we should not define the order in  which the corresponding triggers fire.
  Surely we want to leave that possibility (parallelization) open,  rather than exclude it.
  The user should not depend on the order in which the FOR EACH  STATEMENT and FOR EACH ROW triggers will fire.  They
canalways check  at the end of the transaction with DEFERRED triggers (see also my  patch for ALWAYS DEFERRED
constraintsand triggers).

  AFTER <op> FOR EACH STATEMENT triggers will only run after all the  corresponding DMLs in the mapping have completed,
buttheir relative  orders should still not be defined.

- I don't see how RLS isn't entirely orthogonal.
  RLS would (does) apply as normal to all of the DMLs in the mapping.
  If that was not the case, then there'd be a serious bug in PG right  now!  Using a CTE must *not* disable RLS.
  FOR UPDATE RLS policies are broken, however, since they don't get to  see the OLD and NEW values.  But that's
orthogonalhere.

> FWIW, your example doesn't actually have a source (just a target), so it
> isn't actually like MERGE.

That was my mistake -- as I say above, I had to spend more time with the
various RDBMSes' MERGE docs.

Nico
-- 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

From: Michael Paquier
Date: 08 November 2017, 02:21:49
Subject: Re: [HACKERS] Exclude pg_internal.init from base backup

From: Nico Williams
Date: 08 November 2017, 02:30:03
Subject: Re: [HACKERS] [PATCH] Add ALWAYS DEFERRED option for constraints

Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

Previous

Next