Thread: Repeatable read transaction doesn't see dropped table

Repeatable read transaction doesn't see dropped table

From

Daniil Davydov

Date:

23 December 2024, 10:44:15

Hi,
The documentation for PostgreSQL 17 says the following :
"query in a repeatable read transaction sees a snapshot as of the
start of the first non-transaction-control statement in the
transaction, not as of the start of the current statement within the
transaction"

But I noticed this behavior (REL_17_STABLE):
***
SESSION 1: create two user tables and fill them with data
CREATE TABLE test (id INT);
CREATE TABLE test_1 (id INT);
INSERT INTO test VALUES (1);
INSERT INTO test_1 VALUES (1);

SESSION 2 : begin transaction and allow it to take snapshot
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM test_1;

SESSION 1 : drop table, that was not accessed from second session
DROP TABLE test;

SESSION 2 :
SELECT * FROM test;
***

If I'm not mistaken, second transaction must see all data in table
test (according to documentation), but an error occurs:
***
ERROR:  relation "test" does not exist
LINE 1: select * from test;
***

We are getting this behavior due to the fact that the second session
searching for table's oid in cache via RelnameGetRelid,
but first session already invalidated it. It seems like a bug to me,
so I suggest in such cases to additionally scan pg_class.

I would like to know your opinion.

--
Best regards,
Daniil Davydov

Re: Repeatable read transaction doesn't see dropped table

From

"David G. Johnston"

Date:

23 December 2024, 11:17:35

On Monday, December 23, 2024, Daniil Davydov <3danissimo@gmail.com> wrote:

Hi,
The documentation for PostgreSQL 17 says the following :
"query in a repeatable read transaction sees a snapshot as of the
start of the first non-transaction-control statement in the
transaction, not as of the start of the current statement within the
transaction"

But I noticed this behavior (REL_17_STABLE):
***
SESSION 1: create two user tables and fill them with data
CREATE TABLE test (id INT);
CREATE TABLE test_1 (id INT);
INSERT INTO test VALUES (1);
INSERT INTO test_1 VALUES (1);

SESSION 2 : begin transaction and allow it to take snapshot
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM test_1;

SESSION 1 : drop table, that was not accessed from second session
DROP TABLE test;

SESSION 2 :
SELECT * FROM test;
***

If I'm not mistaken, second transaction must see all data in table
test (according to documentation), but an error occurs:

I would like to know your opinion.

The quoted section describes how two consecutive select queries will see the same data. Your example shows how a single query behaves in isolation. The “as the first query saw it” is fundamentally important since until it successfully executes there are no locks being held restricting the changing of non-data structural aspects of the database. In short, the snapshot doesn’t include an object until it is requested. It’s a repeatable read, not a frozen point-in-time read. The performance implications for the later would be unacceptable.

Thus, the behavior is expected and needed as-is; but I would say that the concurrency control chapter of the documentation is one of the harder to actually learn and understand. It is a challenging topic, so I get why. In its defense, the commentary surrounding the regarding control record and detail does try to make this distinction clear to the reader. YMMV as to its effectiveness in this regard.

David J.

Re: Repeatable read transaction doesn't see dropped table

From

Heikki Linnakangas

Date:

23 December 2024, 13:01:26

On 23/12/2024 10:17, David G. Johnston wrote:
> On Monday, December 23, 2024, Daniil Davydov <3danissimo@gmail.com 
> <mailto:3danissimo@gmail.com>> wrote:
> 
>     Hi,
>     The documentation for PostgreSQL 17 says the following :
>     "query in a repeatable read transaction sees a snapshot as of the
>     start of the first non-transaction-control statement in the
>     transaction, not as of the start of the current statement within the
>     transaction"
> 
>     But I noticed this behavior (REL_17_STABLE):
>     ***
>     SESSION 1: create two user tables and fill them with data
>     CREATE TABLE test (id INT);
>     CREATE TABLE test_1 (id INT);
>     INSERT INTO test VALUES (1);
>     INSERT INTO test_1 VALUES (1);
> 
>     SESSION 2 : begin transaction and allow it to take snapshot
>     BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
>     SELECT * FROM test_1;
> 
>     SESSION 1 : drop table, that was not accessed from second session
>     DROP TABLE test;
> 
>     SESSION 2 :
>     SELECT * FROM test;
>     ***
> 
>     If I'm not mistaken, second transaction must see all data in table
>     test (according to documentation), but an error occurs:
> 
>     I would like to know your opinion.
> 
> 
> The quoted section describes how two consecutive select queries will see 
> the same data.  Your example shows how a single query behaves in 
> isolation.  The “as the first query saw it” is fundamentally important 
> since until it successfully executes there are no locks being held 
> restricting the changing of non-data structural aspects of the 
> database.  In short, the snapshot doesn’t include an object until it is 
> requested.  It’s a repeatable read, not a frozen point-in-time read.  
> The performance implications for the later would be unacceptable.
> 
> Thus, the behavior is expected and needed as-is; but I would say that 
> the concurrency control chapter of the documentation is one of the 
> harder to actually learn and understand.  It is a challenging topic, so 
> I get why.  In its defense, the commentary surrounding the regarding 
> control record and detail does try to make this distinction clear to the 
> reader. YMMV as to its effectiveness in this regard.

Another way to say that is that the snapshot applies to table contents, 
but not the schema. Here's another surprising example:

session 2: establish snapshot
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT 123;

session 1:
CREATE TABLE test AS SELECT 2 as id;

session 2:
SELECT * FROM test;
  id
----
(0 rows)

Session 2 sees the table that was created concurrently, but not its 
contents.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Re: Repeatable read transaction doesn't see dropped table

From

Daniil Davydov

Date:

23 December 2024, 13:11:59

On Mon, Dec 23, 2024 at 3:17 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

> The quoted section describes how two consecutive select queries will see the same data.  Your example shows how a
singlequery behaves in isolation.  The “as the first query saw it” is fundamentally important since until it
successfullyexecutes there are no locks being held restricting the changing of non-data structural aspects of the
database. In short, the snapshot doesn’t include an object until it is requested.  It’s a repeatable read, not a frozen
point-in-timeread.  The performance implications for the later would be unacceptable. 
>
> Thus, the behavior is expected and needed as-is; but I would say that the concurrency control chapter of the
documentationis one of the harder to actually learn and understand.  It is a challenging topic, so I get why.  In its
defense,the commentary surrounding the regarding control record and detail does try to make this distinction clear to
thereader. YMMV as to its effectiveness in this regard. 
>
> David J.
>

Thank you for your comments.

OK, I agree that there are no contradictions from the point of view of
the source code. But essentially, snapshot is just a range of xids,
and we must not see changes of transactions that started after the
snapshot was taken.
As far as I understand, documentation says that repeatable read
transactions take a snapshot at first non-transaction-control
statement, and this snapshot remains relevant for all statements
within the transaction.

My example shows that the second session sees changes that have been
made by a transaction that started after snapshot creation (and it
sees them only because of cache optimization). It might be unexpected
behavior for users.

I also agree that taking it into account will reduce performance, but
maybe we can clarify this aspect in documentation (?)

--
Best regards,
Daniil Davydov