Re: Re: BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: Re: BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions
Date
Msg-id 1394826194.18022.YahooMailNeo@web122305.mail.ne1.yahoo.com
Whole thread Raw
In response to Re: BUG #9301: INSERT WHERE NOT EXISTS on table with UNIQUE constraint in concurrent SERIALIZABLE transactions  (David Johnston <polobo@yahoo.com>)
List pgsql-bugs
David Johnston <polobo@yahoo.com> wrote:=0A> Tom Lane-2 wrote=0A>> mscott@ =
writes:=0A=0A>>> -- The following seems to violate the fundamental guarante=
e of=0A>>> ISOLATION LEVEL SERIALIZABLE in that the two transactions below=
=0A>>> do not behave the same as if they were run serially.=0A>>=0A>> I mus=
t be missing something ... what about the duplicate-key=0A>> error is not w=
hat you would expect if the two transactions had=0A>> been run serially?=0A=
>>=0A>> BTW, the fundamental guarantee is not what you said above.=C2=A0 It=
=0A>> is that you will *either* get the same results as if the=0A>> transac=
tions had been run in some unspecified serial order, *or*=0A>> you will get=
 a serialization error.=C2=A0 There is certainly no=0A>> guarantee about wh=
ich of these cases ensues.=0A=0A> My reading is that if run serially the se=
cond function call=0A> results in a select returning zero records (exists r=
eturns true,=0A> then not-ed to a constant false in the where clause) and t=
hus no=0A> insert is attempted and thus no duplicate key error - the record=
=0A> from the first transaction remains as-is and the second is=0A> effecti=
vely a no-op.=0A>=0A> So the fact the serializable versions fails is correc=
t but it=0A> fails "incorrectly" since a serial execution could never fail=
=0A> with a duplicate key error.=C2=A0 Thus neither a possible unspecified=
=0A> serial order result (dup being an impossible one) nor a=0A> serializat=
ion error occurred which is contrary to the guarantee=0A> that you state.=
=0A>=0A> Now, the fact is the serialization case correctly fails but from=
=0A> an API perspective immediate retry makes no sense for duplicate=0A> ke=
y failure while it does for serialization failure.=0A=0AThis issue has been=
 noted in the literature since at least 2007[1].=0AIn the static code analy=
sis of academic and financial systems at=0AIndian Institute of Technology B=
ombay which had been relying on=0Asnapshot isolation for transactional inte=
grity they initially=0Aflagged many possible sources of data corruption fro=
m serialization=0Afailure which turned out not to actually be possible due =
to primary=0Akeys, unique indexes, or foreign keys.=C2=A0 After considering=
 such=0Aissues they were left with only two real risks in each of the=0Asys=
tems, and were only able to find actual data corruption of one=0Aof those i=
n each system.=0A=0AIf there were a cheap and reliable way to turn violatio=
ns of these=0Aconstraints into serialization failures where appropriate, I =
agree=0Athat it would be better to do so.=C2=A0 I have not been able to see=
 any=0Away to do that, but am all ears if anyone else has an idea.=C2=A0 (N=
ote=0Athe constraints of cheap and reliable; we could allow the SSI=0Amecha=
nism to generate SIReadLocks during constraint enforcement,=0Abut it's not =
clear that it would always find a serialization error=0Abefore the constrai=
nt caused an error, and the overhead might be=0Asignificant.)=0A=0AThe actu=
al guarantee provided is more like: "The behavior of any=0Aset of successfu=
lly committed concurrent serializable transactions=0Awill be consistent wit=
h some unspecified serial order of execution=0Aof those transactions."=C2=
=A0 Constraints may terminate a transaction=0Abefore the serializable snaph=
sot isolation mechanism is able to=0Arecognize a problem and cancel a trans=
action with a different=0ASQLSTATE, but the guarantee should prevent any se=
rialization=0Aanomalies from appearing in the database.=C2=A0 It does prese=
nt a=0Achallenge in terms of knowing whether a constraint validation like=
=0A"duplicate key" indicates a situation which can be automatically=0Are-tr=
ied.=C2=A0 If that is of paramount importance one could implement=0Athe log=
ic for preventing duplicate keys or enforcing foreign keys=0Ain triggers us=
ing serializable transactions, but that would come=0Awith a heavy price in =
both maintenance effort and performance.=0A=0A--=0AKevin Grittner=0AEDB: ht=
tp://www.enterprisedb.com=0AThe Enterprise PostgreSQL Company=0A=0A[1] http=
://www.vldb.org/conf/2007/papers/industrial/p1263-jorwekar.pdf=0AAutomating=
 the Detection of Snapshot Isolation Anomalies=0Aby Sudhir Jorwekar, Krithi=
 Ramamritham, Alan Fekete, S. Sudarshan=0AVLDB =E2=80=9807, September 23-28=
, 2007, Vienna, Austria.

pgsql-bugs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: BUG #9555: pg_dump for tables with inheritance recreates the table with the wrong order of columns
Next
From: Tom Lane
Date:
Subject: Re: BUG #9551: Hang in State "authentication" Prevents Vacuum from Freeing Dead Rows