Re: MERGE Specification - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: MERGE Specification
Date
Msg-id 1209065299.4259.1641.camel@ebony.site
Whole thread Raw
In response to Re: MERGE Specification  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: MERGE Specification  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Thu, 2008-04-24 at 12:19 -0400, Tom Lane wrote:
> Decibel! <decibel@decibel.org> writes:
> > That really strikes me as taking the "MySQL route". If push comes to  
> > shove, I'll take a MERGE with race conditions over no merge at all,  
> > but I think it's very important that it does the right thing. Just  
> > because the spec doesn't say anything about it doesn't mean it's ok.
> 
> Agreed.  It seems to me that in the last set of discussions, we rejected
> implementing MERGE precisely because it failed to provide a solution to
> the race-condition problem.  I'm not satisfied with a version that
> doesn't handle that, because I think that is *exactly* what most people
> will try to use it for.  The non-concurrent bulk update case that Simon
> is arguing for is the uncommon usage.

If y'all think that, then I will do it that way.

The only protection from the race condition is to do the Insert first. 

With MERGE, we would need to do it like this:

1. If there are any WHEN NOT MATCHED clauses that trigger INSERTs, then
attempt to apply them first, no matter what order they were in with
respect to the WHEN MATCHED clauses. Start loop at step (3) every time.
If there aren't any, start loop straight at step (5). Note that we would
need to check to see that INSERTs had not been removed by Rules.

2. For each row retrieved by outer join, goto either step 3 or 5 as
established before the loop starts.

3. Try to apply the WHEN NOT MATCHED clauses. The ordering of the
clauses with respect to each other will remain exactly as stated. If one
of the clauses activates an INSERT, we start an internal subtransaction
and perform the INSERT action. If it succeeds, we commit the
subtransaction and continue.

4. If the INSERT fails with a uniqueness violation, we shrug. The ERROR
has caused the subtransaction to abort.

5. Process WHEN MATCHED clauses and continue.

Technically, this is a standards violation because of the potentially
out-of-order execution of the when clauses. Though the end result is not
distinguishable from standards compliant behaviour, AFAICS.

Note that in step 3 we *must* use subtransactions if there is more than
1 unique index on a table, otherwise we might succeed with the first
index and fail with the second. Using a subtransaction per row pretty
much rules out an efficient bulk load.

Note also that this results in a version optimised for INSERT. If we end
up doing an UPDATE there will be two dead rows, probably in two separate
blocks. We hope that doesn't matter because of HOT.

There's probably a reasonable argument for having an optional keyword to
make MERGE behave differently for bulk loads, but I'll save that now for
another day. Focus now is on a command that works well for OLTP cases.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Proposed patch - psql wraps at window width
Next
From: Chris Browne
Date:
Subject: Re: MERGE Specification