Re: BUG #7623: Inconsistency on transaction isolation documentation - Mailing list pgsql-bugs
From | Louis-Claude Canon |
---|---|
Subject | Re: BUG #7623: Inconsistency on transaction isolation documentation |
Date | |
Msg-id | 508FD7A8.9010808@femto-st.fr Whole thread Raw |
In response to | Re: BUG #7623: Inconsistency on transaction isolation documentation ("Kevin Grittner" <kgrittn@mail.com>) |
List | pgsql-bugs |
Thanks for your helpful explanations. Reading subsection 13.2.1 makes=20 sense now. I am not sure if and how the documentation could be improved,=20 but I propose below some suggestions that you may find relevant. My=20 confusion may have stem from my lack of knowledge in PostgreSQL locking=20 mechanism. Le 29/10/2012 22:42, Kevin Grittner a =C3=A9crit : > louis-claude.canon@univ-fcomte.fr wrote: > >> In the first paragraph of Section 13.2.1, it is first stated that >> "[with] this isolation level, a SELECT query [...] never sees [...] >> changes committed during query execution by concurrent >> transactions." It is my understanding that this is untrue (as the >> rest of the paragraph seems to imply). > What are you seeing which you feels contradicts that? (I don't see > anything which appears to contradict it, but maybe there's more than > one way for someone to read it, and I'm missing something.) My mistake, I interpreted "during query execution" as "during the=20 current transaction execution". It could be more precise to replace "it=20 never sees either uncommitted data or changes committed during query=20 execution by concurrent transactions" by one of the following : - "it never sees either uncommitted data or changes committed during=20 _this_ query execution by concurrent transactions" - "_this query_ never sees either uncommitted data or changes committed=20 during _its_ execution by concurrent transactions" >> Then, in the same subsection, an example is given in which a DELETE >> statement is affected by an uncommitted UPDATE. > No, the DELETE statement is blocked by the UPDATE to see whether it > commits or rolls back. If the transaction with the UPDATE rolls back, > the results of the DELETE will not be affected by the UPDATE. If the > UPDATE commits, the committed transaction will affect the DELETE. > >> It is unclear to me if this still corresponds to the standard >> "Committed Read" isolation level. > Yes, it's still in section 13.2.1, which is all about READ COMMITTED > transactions. And I believe that it still complies with the > requirements of the standard for that level. > >> It could be clearer with more precision: when is committed the >> DELETE statement? > I guess it could be more clear that the DELETE will block pending the > completion of the transaction containing the previously run UPDATE. > The commit of the DELETE will be some time after the commit of the > transaction containing the UPDATE. > >> What is the snapshot that it sees (when does the transaction >> start)? Or maybe this standard isolation level only specified that >> SELECT statements must not be affected by uncommitted changes? > No transaction is affected (other than blocking or possibly rolling > back with a serialization failure), by an *uncommitted* transaction. > In this example the DELETE is affected by a *committed* transaction. > The snapshot for the DELETE is taken after the query is parsed and > before it starts scanning for rows to delete. When there is a write > conflict in READ COMMITTED it follows the update chain from the > visible row to find the latest committed version of the row, blocking > for a change in progress; so technically an UPDATE or DELETE at the > READ COMMITTED transaction isolation level sometimes strays from a > single snapshot for the statement. The alternative would be to > restart the statement after a write conflict where the other > transaction commits, which could perform much worse when a statement > affects a large number of rows. A writing query may therefore find rows by reading some committed=20 values, wait for a writing lock and then retest if the write should be=20 performed based on values from a subsequent commit. This is not=20 intuitive that data accessed from a single query may correspond to=20 several commits. The second paragraph could be started by something like=20 "Although, any single query will see only committed values, a writing=20 query may see an inconsistent state of the database" for a better=20 introducing this inconsistent idea. Also, as locking is described later=20 (section 13.3), the UPDATE/DELETE example explanation (the sentence=20 "This occurs because [...] the criteria.") may be too synthetic. I=20 propose to extend it by stating which snapshot are considered and how=20 operations are sequentially performed. For example: "The DELETE query=20 selects from the committed database the row in which hits=3D10. As this=20 row is locked by the UPDATE (see ROW EXCLUSIVE lock in Section 13.3),=20 the DELETE is blocked and wait for the COMMIT to occurs, which will free=20 the lock. At this point, the DELETE proceeds by retesting if its=20 selected row must be deleted. This is no longer the case as the DELETE=20 snapshot of the row results from the last COMMIT that incremented hits=20 to 11." >> As I am not an expert, I cannot guarantee that there is indeed any >> issue. > Well, based on what you said there appears to be at lest some room to > clarify or elaborate, so people better understand it. > >> However, I believe it could be useful that an expert proof-reads or >> reviews this subsection. For example, I am unsure whether any query >> will see a consistent snapshot with only committed transactions (as >> it is stated at first) or if uncommitted transactions may have an >> impact (as the last example suggests). > READ COMMITTED can only see work of committed transactions, but does > not necessarily see only data visible according to its snapshot. > >> I would also suggest to following change for clarity (or the >> contrary change): >> "The point at issue above is whether or not a single command sees >> an absolutely consistent view of the database." -> "The point at >> issue above is that any single command sees an absolutely >> consistent view of the database." > Based on my clarification here, do you agree that the current > language is more accurate than your proposed revision? I would propose instead the opposite change: "The point at issue above=20 is that any single command does not always see an absolutely consistent=20 view of the database." to emphasize the risk of this isolation level=20 (the current "whether or not" does not highlight which of the=20 alternatives is actually true). Regards, Louis-Claude Canon
pgsql-bugs by date: