Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id CANP8+j+FPEqpd8O3scwOKUgRvtWB5r5RhcCuJ8nn6VU_wc2Nng@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] MERGE SQL Statement for PG11
List pgsql-hackers
On 28 October 2017 at 00:31, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Oct 27, 2017 at 9:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Oct 27, 2017 at 4:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> I didn't say it but my intention was to just throw an ERROR if no
>>> single unique index can be identified.
>>>
>>> It could be possible to still run MERGE in that situaton but we would
>>> need to take a full table lock at ShareRowExclusive. It's quite likely
>>> that such statements would throw duplicate update errors, so I
>>> wouldn't be aiming to do anything with that for PG11.
>>
>> Like Peter, I think taking such a strong lock for a DML statement
>> doesn't sound like a very desirable way forward.  It means, for
>> example, that you can only have one MERGE in progress on a table at
>> the same time, which is quite limiting.  It could easily be the case
>> that you have multiple MERGE statements running at once but they touch
>> disjoint groups of rows and therefore everything works.  I think the
>> code should be able to cope with concurrent changes, if nothing else
>> by throwing an ERROR, and then if the user wants to ensure that
>> doesn't happen by taking ShareRowExclusiveLock they can do that via an
>> explicit LOCK TABLE statement -- or else they can prevent concurrency
>> by any other means they see fit.
>
> +1, I would suspect users to run this query in parallel of the same
> table for multiple data sets.
>
> Peter has taken some time to explain me a bit his arguments today, and
> I agree that it does not sound much appealing to have constraint
> limitations for MERGE. Particularly using the existing ON CONFLICT
> structure gets the feeling of having twice a grammar for what's
> basically the same feature, with pretty much the same restrictions.
>
> By the way, this page sums up nicely the situation about many
> implementations of UPSERT taken for all systems:
> https://en.wikipedia.org/wiki/Merge_(SQL)

That Wikipedia article is badly out of date and regrettably does NOT
sum up the current situation nicely any more since MERGE has changed
in definition in SQL:2011 since its introduction in SQL:2003.

I'm proposing a MERGE statement for PG11 that
i) takes a RowExclusiveLock on rows, so can be run concurrently
ii) uses the ON CONFLICT infrastructure to do that
and so requires a unique constraint.

The above is useful behaviour that will be of great benefit to
PostgreSQL users. There are no anomalies remaining.

SQL:2011 specifically states "The extent to which an
SQL-implementation may disallow independent changes that are not
significant is implementation-defined”, so in my reading the above
behaviour would make us fully spec compliant. Thank you to Peter for
providing the infrastructure on which this is now possible for PG11.

Serge puts this very nicely by identifying two different use cases for MERGE.

Now, I accept that you might also want a MERGE statement that
continues to work even if there is no unique constraint, but it would
need to have different properties to the above. I do not in any way
argue against adding that. I also agree that adding RETURNING at a
later stage would be fine as well. I am proposing that those and any
other additional properties people come up with can be added in later
releases once we have the main functionality in core in PG11.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Fix typo in blvacuum.c
Next
From: Chris Travers
Date:
Subject: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs