Re: BUG #16961: Could not access status of transaction - Mailing list pgsql-bugs

From Noah Misch
Subject Re: BUG #16961: Could not access status of transaction
Date
Msg-id 20210605205545.GE228552@rfd.leadboat.com
Whole thread Raw
In response to RE: BUG #16961: Could not access status of transaction  (Stepan Yankevych <Stepan_Yankevych@epam.com>)
Responses Re: BUG #16961: Could not access status of transaction  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Mon, Apr 19, 2021 at 06:49:15PM +0000, Stepan Yankevych wrote:
> The database was rebooted and the issue disappeared

Each postmaster restart clears the LISTEN/NOTIFY queue, so that fits.

> The issue happening each morning when application starts on the production DataBase during about a month.
> Always the same transaction id is mentioned in the error (1954017648)
> We tried to do UNLISTEN - no changes. the same issue.
> LISTEN works good for any other channels.
> 
> 
> Can it be related to some hanged transaction? 1954017648? (for example while some network interruption)

Something like that; see below.

> Is it possible to kill/clean it somehow without DB restart?

Not to my knowledge.

> Can it be related to some non-vacuumed system table or so?

Probably not.

> Command  - LISTEN missed_trades_empty_instrument
> 
> ERROR:  could not access status of transaction 1954017648
> DETAIL:  Could not open file "pg_xact/0747": No such file or directory.
> STATEMENT:  LISTEN missed_trades_empty_instrument

The LISTEN/NOTIFY queue stores a transaction id for each notification
(internally, each AsyncQueueEntry).  I can imagine the "could not access
status" happening if a sequence of events like this happened since the last
postmaster restart:

backend 1: LISTEN missed_trades_empty_instrument
backend 2: BEGIN; NOTIFY missed_trades_empty_instrument [TransactionId N]
backend 1: BEGIN
backend 2: COMMIT
backend 1: CREATE TEMP TABLE x (); [sets TransactionId N+K]
autovacuum: freezes tuples, deletes pg_xact data covering TransactionId N
backend 1: COMMIT

On the other hand, if that's what happened, your report that "LISTEN works
good for any other channels" surprises me.  Perhaps something completely
different happened on your system.

We could prevent the trouble if vac_truncate_clog() had access to the oldest
xid in the notification queue; it would set frozenXID to that value if
frozenXID would otherwise be older.



pgsql-bugs by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: logical decoding bug: segfault in ReorderBufferToastReplace()
Next
From: Tom Lane
Date:
Subject: Re: BUG #16961: Could not access status of transaction