Thread: Re:Fwd: BUG #17017: Two versions of the same row of records are returned in one query

Hi Tom,
I think the repeatable read transaction should give priority to see the new version written by itself, e.g., the insertion of the above example. There is no need to return the old version written by committed transactions. The query result fo Session B is very confusing!

------------------ Original ------------------
Date:  Wed, May 19, 2021 09:05 AM
To:  "52194501011"<52194501011@stu.ecnu.edu.cn>;
Subject:  Fwd: BUG #17017: Two versions of the same row of records are returned in one query
 


---------- Forwarded message ---------
发件人: Tom Lane <tgl@sss.pgh.pa.us>
Date: 2021年5月19日周三 00:18
Subject: Re: BUG #17017: Two versions of the same row of records are returned in one query
To: <lpy.henu@gmail.com>
Cc: <pgsql-bugs@lists.postgresql.org>


PG Bug reporting form <noreply@postgresql.org> writes:
> Schema and Initial data:
> Create Table t(a int primary key, b int);
> Insert into t values(1,2);
> Insert into t values(2,3);

> [Time0, SessonA]
>> Begin;
>> set transaction isolation level repeatable read;
>> Select * from t where a=1;

> [Time1, SessonB]
>> Begin;
>> set transaction isolation level read committed;
>> Delete from t where a=2;
>> Commit;

> [Time2, SessonA]
>> Insert into t values(2,4);
>> Select * from t where a=2;

> Here, we expect PostgreSQL Server to return a row:
> 2 3

> However, it returns two rows:
> 2 4
> 2 3

In repeatable read mode, this doesn't surprise me.  Session A
surely must return the (2,3) row, since it isn't supposed to
"see" the results of Session B's commit yet.  And it would be
mighty surprising for it not to see its own insertion, so you
get (2,4) as well.

In serializable mode, I'd expect this situation to throw a
serialization error, and it does (but you must run BOTH
transactions in serializable mode; there are not guarantees
with a mix of serializable and non-serializable transactions).

So I think the answer is "if you care about this, use
serializable mode".  Repeatable read mode meets the
requirements of the SQL spec, but it doesn't guarantee
no anomalies.

                        regards, tom lane
On Wed, May 19, 2021 at 12:27 AM 李可强 <52194501011@stu.ecnu.edu.cn> wrote:
> There is no need to return the old version written by committed transactions. The query result fo Session B is very
confusing!

FWIW I think that it's a matter of how the standard is interpreted.
You expect something that Kyle Kingsbury called "Adya repeatable
read". This interpretation of RR allows an application which only
accesses records by primary key to get serializable execution.
Essentially, the primary key is special in a system that uses
traditional two-phase locking (or something closer to it than the
Postgres model). But a predicate on the PK is not special in Postgres.

--
Peter Geoghegan



Hi Peter, 

Thank you for your kind response.  If possible, I want to discuss more about my confusion on this problem. 

You have mentioned that "a predicate on the PK is not special in Postgres". Does it mean that PG takes PK as a common column instead of the identifier of rows if selecting rows by a predicate on the PK? If it is so, in a relational database, PG may break the unique constraints on PK.  

For example, in my mentioned bug case, the query on PG returns two rows which have the same PK. From my point of view, such a behavior violates the primary key unique constraint. 

Look forward to the response.

Thank you.
 
 
------------------ Original ------------------
From: "Peter Geoghegan";
Date: 2021年5月20日(星期四) 上午9:56
To: "52194501011@stu.ecnu.edu.cn"<52194501011@stu.ecnu.edu.cn>;
Cc: "tgl"; "pgsql-bugs";
Subject: Re: Fwd: BUG #17017: Two versions of the same row of records are returned in one query
 
On Wed, May 19, 2021 at 12:27 AM 李可强 <52194501011@stu.ecnu.edu.cn> wrote:
> There is no need to return the old version written by committed transactions. The query result fo Session B is very confusing!

FWIW I think that it's a matter of how the standard is interpreted.
You expect something that Kyle Kingsbury called "Adya repeatable
read". This interpretation of RR allows an application which only
accesses records by primary key to get serializable execution.
Essentially, the primary key is special in a system that uses
traditional two-phase locking (or something closer to it than the
Postgres model). But a predicate on the PK is not special in Postgres.

--
Peter Geoghegan
On Thu, Oct 6, 2022 at 12:28 AM 李可强 <52194501011@stu.ecnu.edu.cn> wrote:
> You have mentioned that "a predicate on the PK is not special in Postgres". Does it mean that PG takes PK as a common
columninstead of the identifier of rows if selecting rows by a predicate on the PK? 

I'm not 100% sure that I understand this question, but I think that
the answer is yes.

> For example, in my mentioned bug case, the query on PG returns two rows which have the same PK. From my point of
view,such a behavior violates the primary key unique constraint. 

I believe that your argument is correct (it is wrong) according to some
interpretations of "repeatable read". What Postgres calls repeatable
read is actually snapshot isolation.

The simple fact is that the terminology in this area is confusing, and
it seems as if nobody agrees on all the details. It is generally
understood that the behavior that Postgres calls "repeatable read" is
actually snapshot isolation (same is true of Oracle). So the behavior
that you find surprising is not surprising to us, and is not
considered a bug. I am sympathetic, but it's not going to change now.

This point is made clear in the documentation:

https://www.postgresql.org/docs/current/transaction-iso.html#XACT-REPEATABLE-READ

"A Critique of ANSI SQL Isolation Levels" by Berenson et al explains
this controversy in great detail.


--
Peter Geoghegan