Thread: Re: BUG #7623: Inconsistency on transaction isolation documentation

Re: BUG #7623: Inconsistency on transaction isolation documentation

From
"Kevin Grittner"
Date:
louis-claude.canon@univ-fcomte.fr wrote:

> In the first paragraph of Section 13.2.1, it is first stated that
> "[with] this isolation level, a SELECT query [...] never sees [...]
> changes committed during query execution by concurrent
> transactions." It is my understanding that this is untrue (as the
> rest of the paragraph seems to imply).

What are you seeing which you feels contradicts that? (I don't see
anything which appears to contradict it, but maybe there's more than
one way for someone to read it, and I'm missing something.)

> Then, in the same subsection, an example is given in which a DELETE
> statement is affected by an uncommitted UPDATE.

No, the DELETE statement is blocked by the UPDATE to see whether it
commits or rolls back. If the transaction with the UPDATE rolls back,
the results of the DELETE will not be affected by the UPDATE. If the
UPDATE commits, the committed transaction will affect the DELETE.

> It is unclear to me if this still corresponds to the standard
> "Committed Read" isolation level.

Yes, it's still in section 13.2.1, which is all about READ COMMITTED
transactions. And I believe that it still complies with the
requirements of the standard for that level.

> It could be clearer with more precision: when is committed the
> DELETE statement?

I guess it could be more clear that the DELETE will block pending the
completion of the transaction containing the previously run UPDATE.
The commit of the DELETE will be some time after the commit of the
transaction containing the UPDATE.

> What is the snapshot that it sees (when does the transaction
> start)? Or maybe this standard isolation level only specified that
> SELECT statements must not be affected by uncommitted changes?

No transaction is affected (other than blocking or possibly rolling
back with a serialization failure), by an *uncommitted* transaction.
In this example the DELETE is affected by a *committed* transaction.
The snapshot for the DELETE is taken after the query is parsed and
before it starts scanning for rows to delete. When there is a write
conflict in READ COMMITTED it follows the update chain from the
visible row to find the latest committed version of the row, blocking
for a change in progress; so technically an UPDATE or DELETE at the
READ COMMITTED transaction isolation level sometimes strays from a
single snapshot for the statement. The alternative would be to
restart the statement after a write conflict where the other
transaction commits, which could perform much worse when a statement
affects a large number of rows.

> As I am not an expert, I cannot guarantee that there is indeed any
> issue.

Well, based on what you said there appears to be at lest some room to
clarify or elaborate, so people better understand it.

> However, I believe it could be useful that an expert proof-reads or
> reviews this subsection. For example, I am unsure whether any query
> will see a consistent snapshot with only committed transactions (as
> it is stated at first) or if uncommitted transactions may have an
> impact (as the last example suggests).

READ COMMITTED can only see work of committed transactions, but does
not necessarily see only data visible according to its snapshot.

> I would also suggest to following change for clarity (or the
> contrary change):
> "The point at issue above is whether or not a single command sees
> an absolutely consistent view of the database." -> "The point at
> issue above is that any single command sees an absolutely
> consistent view of the database."

Based on my clarification here, do you agree that the current
language is more accurate than your proposed revision?

-Kevin

Re: BUG #7623: Inconsistency on transaction isolation documentation

From
Louis-Claude Canon
Date:
Thanks for your helpful explanations. Reading subsection 13.2.1 makes=20
sense now. I am not sure if and how the documentation could be improved,=20
but I propose below some suggestions that you may find relevant. My=20
confusion may have stem from my lack of knowledge in PostgreSQL locking=20
mechanism.

Le 29/10/2012 22:42, Kevin Grittner a =C3=A9crit :
> louis-claude.canon@univ-fcomte.fr wrote:
>
>> In the first paragraph of Section 13.2.1, it is first stated that
>> "[with] this isolation level, a SELECT query [...] never sees [...]
>> changes committed during query execution by concurrent
>> transactions." It is my understanding that this is untrue (as the
>> rest of the paragraph seems to imply).
> What are you seeing which you feels contradicts that? (I don't see
> anything which appears to contradict it, but maybe there's more than
> one way for someone to read it, and I'm missing something.)

My mistake, I interpreted "during query execution" as "during the=20
current transaction execution". It could be more precise to replace "it=20
never sees either uncommitted data or changes committed during query=20
execution by concurrent transactions" by one of the following :
- "it never sees either uncommitted data or changes committed during=20
_this_ query execution by concurrent transactions"
- "_this query_ never sees either uncommitted data or changes committed=20
during _its_ execution by concurrent transactions"

>> Then, in the same subsection, an example is given in which a DELETE
>> statement is affected by an uncommitted UPDATE.
> No, the DELETE statement is blocked by the UPDATE to see whether it
> commits or rolls back. If the transaction with the UPDATE rolls back,
> the results of the DELETE will not be affected by the UPDATE. If the
> UPDATE commits, the committed transaction will affect the DELETE.
>
>> It is unclear to me if this still corresponds to the standard
>> "Committed Read" isolation level.
> Yes, it's still in section 13.2.1, which is all about READ COMMITTED
> transactions. And I believe that it still complies with the
> requirements of the standard for that level.
>
>> It could be clearer with more precision: when is committed the
>> DELETE statement?
> I guess it could be more clear that the DELETE will block pending the
> completion of the transaction containing the previously run UPDATE.
> The commit of the DELETE will be some time after the commit of the
> transaction containing the UPDATE.
>
>> What is the snapshot that it sees (when does the transaction
>> start)? Or maybe this standard isolation level only specified that
>> SELECT statements must not be affected by uncommitted changes?
> No transaction is affected (other than blocking or possibly rolling
> back with a serialization failure), by an *uncommitted* transaction.
> In this example the DELETE is affected by a *committed* transaction.
> The snapshot for the DELETE is taken after the query is parsed and
> before it starts scanning for rows to delete. When there is a write
> conflict in READ COMMITTED it follows the update chain from the
> visible row to find the latest committed version of the row, blocking
> for a change in progress; so technically an UPDATE or DELETE at the
> READ COMMITTED transaction isolation level sometimes strays from a
> single snapshot for the statement. The alternative would be to
> restart the statement after a write conflict where the other
> transaction commits, which could perform much worse when a statement
> affects a large number of rows.

A writing query may therefore find rows by reading some committed=20
values, wait for a writing lock and then retest if the write should be=20
performed based on values from a subsequent commit. This is not=20
intuitive that data accessed from a single query may correspond to=20
several commits. The second paragraph could be started by something like=20
"Although, any single query will see only committed values, a writing=20
query may see an inconsistent state of the database" for a better=20
introducing this inconsistent idea. Also, as locking is described later=20
(section 13.3), the UPDATE/DELETE example explanation (the sentence=20
"This occurs because [...] the criteria.") may be too synthetic. I=20
propose to extend it by stating which snapshot are considered and how=20
operations are sequentially performed. For example: "The DELETE query=20
selects from the committed database the row in which hits=3D10. As this=20
row is locked by the UPDATE (see ROW EXCLUSIVE lock in Section 13.3),=20
the DELETE is blocked and wait for the COMMIT to occurs, which will free=20
the lock. At this point, the DELETE proceeds by retesting if its=20
selected row must be deleted. This is no longer the case as the DELETE=20
snapshot of the row results from the last COMMIT that incremented hits=20
to 11."

>> As I am not an expert, I cannot guarantee that there is indeed any
>> issue.
> Well, based on what you said there appears to be at lest some room to
> clarify or elaborate, so people better understand it.
>
>> However, I believe it could be useful that an expert proof-reads or
>> reviews this subsection. For example, I am unsure whether any query
>> will see a consistent snapshot with only committed transactions (as
>> it is stated at first) or if uncommitted transactions may have an
>> impact (as the last example suggests).
> READ COMMITTED can only see work of committed transactions, but does
> not necessarily see only data visible according to its snapshot.
>
>> I would also suggest to following change for clarity (or the
>> contrary change):
>> "The point at issue above is whether or not a single command sees
>> an absolutely consistent view of the database." -> "The point at
>> issue above is that any single command sees an absolutely
>> consistent view of the database."
> Based on my clarification here, do you agree that the current
> language is more accurate than your proposed revision?

I would propose instead the opposite change: "The point at issue above=20
is that any single command does not always see an absolutely consistent=20
view of the database." to emphasize the risk of this isolation level=20
(the current "whether or not" does not highlight which of the=20
alternatives is actually true).

Regards,

Louis-Claude Canon