Re: ask for review of MERGE - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: ask for review of MERGE |
Date | |
Msg-id | AANLkTikF4JH9=xu51XLBeWDPqXYpRLn1YFt6gvQBVCqf@mail.gmail.com Whole thread Raw |
In response to | Re: ask for review of MERGE ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Responses |
Re: ask for review of MERGE
Re: ask for review of MERGE |
List | pgsql-hackers |
On Mon, Oct 25, 2010 at 3:15 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> rhaas=# create table concurrent (x integer primary key); >> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index >> "concurrent_pkey" for table "concurrent" >> CREATE TABLE >> rhaas=# insert into x values (1); >> rhaas=# begin; >> BEGIN >> rhaas=# insert into concurrent values (2); >> INSERT 0 1 >> >> <switch to a different window> >> >> rhaas=# update concurrent set x=x where x=2; >> UPDATE 0 > > That surprised me. I would have thought that the INSERT would have > created an "in doubt" tuple which would block the UPDATE. What is > the reason for not doing so? This is just standard MVCC - readers don't block writers, nor writers readers. You might also think about what would happen if the UPDATE were run before the INSERT of (2). There's no serialization anomaly here, because either concurrent case is equivalent to the serial schedule where the update precedes the insert. In the case of a MERGE that matches a just-inserted invisible tuple but no visible tuple, things are a bit stickier. Let's suppose we're trying to use MERGE to get UPSERT semantics. If the MERGE command has the obvious behavior of ignoring the invisible tuple just as UPDATE or DELETE would do, then clearly any equivalent serial schedule must run the MERGE before the INSERT (because if it were run after the INSERT, it would fire the MATCHED action rather than the NOT MATCHED action). But if the merge were run before the INSERT, then the INSERT would have failed with a unique key violation; instead, the merge fails with a unique key violation. On the other hand, if the MERGE sees the invisible tuple, essentially using SnapshotNow semantics, as Greg Stark proposed, you get a different (and probably worse) class of serialization anomalies. For example, suppose the table has rows 1-100 and you do an update adding 1000 to each value concurrently with merging in the values 51-100. You might get something like this: - MERGE scans rows 1-75, firing MATCHED for rows 51-75. - UPDATE commits. - MERGE scans rows 76-100, firing NOT MATCHED for each. Now, as Greg says, that might be what some people want, but it's certainly monumentally unserializable. In a serial execution schedule, the MERGE will either run before the UPDATE, in which case MATCHED will fire for rows 51-100, or else the UPDATE will run before the MERGE, in which case NOT MATCHED will fire for rows 51-100. No serial schedule is going to fire MATCHED for some rows and NOT MATCHED for others. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: