Re: Repeatable read and serializable transactions see data committed after tx start - Mailing list pgsql-hackers
From | Álvaro Hernández Tortosa |
---|---|
Subject | Re: Repeatable read and serializable transactions see data committed after tx start |
Date | |
Msg-id | 545E819C.9040300@8Kdata.com Whole thread Raw |
In response to | Re: Repeatable read and serializable transactions see data committed after tx start (Kevin Grittner <kgrittn@ymail.com>) |
List | pgsql-hackers |
<br /><div class="moz-cite-prefix">On 06/11/14 15:00, Kevin Grittner wrote:<br /></div><blockquote cite="mid:1415282403.10098.YahooMailNeo@web122304.mail.ne1.yahoo.com"type="cite"><pre wrap="">Álvaro Hernández Tortosa <aclass="moz-txt-link-rfc2396E" href="mailto:aht@8Kdata.com"><aht@8Kdata.com></a> wrote: </pre><blockquote type="cite"><pre wrap=""> There has been two comments which seem to state that changing this may introduce some performance problems and some limitations when you need to take out some locks. I still believe, however, that current behavior is confusing for the user. Sure, one option is to patch the documentation, as I was suggesting. </pre></blockquote><pre wrap=""> Yeah, I thought that's what we were talking about, and in that regard I agree that the docs could be more clear. I'm not quite sure what to say where to fix that, but I can see how someone could be confused and have the expectation that once they have run BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE the transaction will not see the work of transactions committing after that. The fact that this is possible is implied, if one reads carefully and thinks about it, by the statement right near the start of the "Transaction Isolation" section which says "any concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order." As Robert pointed out, this is not necessarily the commit order or the transaction start order. It is entirely possible that if you have serializable transactions T1 and T2, where T1 executes BEGIN first (and even runs a query before T2 executes BEGIN) and T1 commits first, that T2 will "appear" to have run first because it will look at a set of data which T1 modifies and not see the changes. If T1 were to *also* look at a set of data which T2 modifies, then one of the transactions would be rolled back with a serialization failure, to prevent a cycle in the apparent order of execution; so the requirements of the standard (and of most software which is attempting to handle race conditions) is satisfied. For many popular benchmarks (and I suspect most common workloads) this provides the necessary protections with better performance than is possible using blocking to provide the required guarantees.[1]</pre></blockquote><br /> Yes, you're right in that the"any concurrent execution..." phrase implicitly means that snapshot may not be taken at BEGIN or SET TRANSACTION time,but it's definitely not clear enough for the average user. Yet this may apply to the serializable case, but it doesn'tto the repeatable read where the docs read " The Repeatable Read isolation level only sees data committed before thetransaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrenttransactions". The first part is confusing, as we discussed; the second part is even more confusing as it says"during transaction execution", and isn't the transaction -not the snapshot- beginning at BEGIN time?<br /><br /> Surprisingly, the language is way more clear in the SET TRANSACTION doc page [2].<br /> <br /><blockquote cite="mid:1415282403.10098.YahooMailNeo@web122304.mail.ne1.yahoo.com"type="cite"><pre wrap=""> At any rate, the language in that section is a little fuzzy on the concept of the "start of the transaction." Perhaps it would be enough to change language like: | sees a snapshot as of the start of the transaction, not as of the | start of the current query within the transaction. to: | sees a snapshot as of the start of the first query within the | transaction, not as of the start of the current query within the | transaction. Would that have prevented the confusion here?</pre></blockquote><br /> I think that definitely helps. But it may be betterto make it even more clear, more explicit. And offering a solution for the user who may like the snapshot to be taken"at begin time", like suggesting to do a "SELECT 1" query.<br /><blockquote cite="mid:1415282403.10098.YahooMailNeo@web122304.mail.ne1.yahoo.com"type="cite"><pre wrap=""> </pre><blockquote type="cite"><pre wrap=""> But what about creating a flag to BEGIN and SET TRANSACTION commands, called "IMMEDIATE FREEZE" (or something similar), which applies only to REPEATABLE READ and SERIALIZABLE? If this flag is set (and may be off by default, but of course the default may be configurable via a guc parameter), freeze happens when it is present (BEGIN or SET TRANSACTION) time. This would be a backwards-compatible change, while would provide the option of freezing without the nasty hack of having to do a "SELECT 1" prior to your real queries, and everything will of course be well documented. </pre></blockquote><pre wrap=""> What is the use case where you are having a problem? This seems like an odd solution, so it would be helpful to know what problem it is attempting to solve.</pre></blockquote><br /> I don't have a particular use case. I just came across the issueand thought the documentation and behavior wasn't consistent. So the first aim is not to have users surprised (in abad way). But I see a clear use case: users who might want to open a (repeatable read | serializable) transaction to havetheir view of the database frozen, to perform any later operation on that frozen view. Sure, that comes at a penalty,but I see that potentially interesting too.<br /><br /> Regards,<br /> <br /> Álvaro<br /><br /><br /><br/> [1] <a class="moz-txt-link-freetext" href="http://www.postgresql.org/docs/9.4/static/transaction-iso.html">http://www.postgresql.org/docs/9.4/static/transaction-iso.html</a><br />[2] <a class="moz-txt-link-freetext" href="http://www.postgresql.org/docs/9.4/static/sql-set-transaction.html">http://www.postgresql.org/docs/9.4/static/sql-set-transaction.html</a><br /><preclass="moz-signature" cols="72">-- Álvaro Hernández Tortosa ----------- 8Kdata </pre>
pgsql-hackers by date: