Thread: BUG #18804: LISTEN on channel fails with "could not access status of transaction"

BUG #18804: LISTEN on channel fails with "could not access status of transaction"

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18804
Logged by:          Andrei Varashen
Email address:      voroshen.av@gmail.com
PostgreSQL version: 16.3
Operating system:   Debian GNU/Linux 12 (bookworm)
Description:

Hi there, 

I've encountered the error while trying to register a listener with the
LISTEN channel_name statement in my production database:

STATEMENT: LISTEN river_leadership
ERROR: could not access status of transaction 2048841894
DETAIL: Could not open file "pg_xact/07A1": No such file or directory.

That also led to an error when attempting to register a listener for any
other channel in that database.

After some investigation, I found that similar issue has been reported
several times already, but none of the investigations cocluded with a
solution except for recommendation to restart postmaster in order to clean
up the LISTEN/NOTIFY queue (as far as I understood). This workaround works,
but the root cause of this behaviour remains unaddressed. Similar bug
reports:

1.
https://www.postgresql.org/message-id/16961-25f29f95b3604a8a%40postgresql.org
2.
https://www.postgresql.org/message-id/18394-e7459245148578b2@postgresql.org

I managed to reproduce the issue locally. 

Pre-conditions: 

1. postgres is deployed using docker. image:

https://hub.docker.com/layers/library/postgres/16.3/images/sha256-1b277d0af2273577d8547b295e3834baaf0c04f5b3b823882f686bac23502cf7

2. autovacuum is disabled (to avoid any automatic intervention), all other
properties use default values:

> cat postgres-test.conf | grep 'autovacuum = off'
autovacuum = off

3. test table is created in postgres database (used later to generate a high
number of update transactions):

postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test values (1);
INSERT 0 1


Steps to reproduce:

1. listen to the channel (backend 1): 

postgres=# LISTEN test_chan;
LISTEN

2. notify the channel (backend 2): 

postgres=# NOTIFY test_chan;
NOTIFY
postgres=# select txid_current();
 txid_current
--------------
          734
(1 row)

3. receive notification (backend 1):

postgres=# LISTEN test_chan;
LISTEN
Asynchronous notification "test_chan" received from server process with PID
86.

4. execute a simple update command on test table (from the pre-conditions
section) using pgbench in order to generate enough transactions to create
pg_xact/0001 file:

> pgbench --version

                                                     
pgbench (PostgreSQL) 16.3 (Debian 16.3-1.pgdg120+1)
> cat test.sql
UPDATE test SET id = 1;
> ls -lah /var/lib/postgresql/data/pg_xact
total 16K
drwx------  2 postgres postgres 4.0K Feb 11 12:45 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres 8.0K Feb 11 12:45 0000
> export PGPASSWORD=postgres; pgbench -h localhost -p 5432 -U postgres -c 80
-j 10 -t 15000 -f test.sql postgres

(pgbench generates 1_200_000 transactions in total, an empirically found
number that completely fills pg_xact/0000 and leads to pg_xact/0001
creation)

5. flush changes from shared buffered to disk (backend 3):

postgres=# CHECKPOINT;
CHECKPOINT

6. verify that pg_xact/0001 is created:

> ls -lah /var/lib/postgresql/data/pg_xact
total 304K
drwx------  2 postgres postgres 4.0K Feb 11 13:00 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres 256K Feb 11 13:00 0000
-rw-------  1 postgres postgres  40K Feb 11 13:00 0001

7. execute VACUUM FREEZE on each database at the server to freeze rows and
purge pg_xact/0000 (backend 4):

postgres=# VACUUM FREEZE;
VACUUM

postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# VACUUM FREEZE;
VACUUM

template1=# ALTER DATABASE template0 WITH ALLOW_CONNECTIONS true;
ALTER DATABASE
template1=# \c template0
You are now connected to database "template0" as user "postgres".
template0=# VACUUM FREEZE;
VACUUM

template0=# select datname, datfrozenxid from pg_database;
  datname  | datfrozenxid
-----------+--------------
 postgres  |      1200736
 template0 |      1200737
 template1 |      1200736
(3 rows)

postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# VACUUM FREEZE;
VACUUM

8. ensure that pg_xact/0000 is gone

> ls -lah /var/lib/postgresql/data/pg_xact
total 48K
drwx------  2 postgres postgres 4.0K Feb 11 15:23 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres  40K Feb 11 13:00 0001

9. try to listen to test_chan (backend 5):

postgres=# LISTEN test_chan;
ERROR:  could not access status of transaction 733
DETAIL:  Could not open file "pg_xact/0000": No such file or directory.

10. ensure that it is impossible to listen to any otherchannel (backend
6):

postgres=# LISTEN another_test_chan;
ERROR:  could not access status of transaction 733
DETAIL:  Could not open file "pg_xact/0000": No such file or directory

Expected behaviour: 

I should be able to listen to any channel in the database after Step 7

NOTE:

I have tried the same steps on the latest minor release for my major version
(16.6, to be precise) and the issue still persists.


Hi! Just kind reminder about bug report. I'm still waiting for some response

On Tue, Feb 11, 2025 at 5:27 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      18804
Logged by:          Andrei Varashen
Email address:      voroshen.av@gmail.com
PostgreSQL version: 16.3
Operating system:   Debian GNU/Linux 12 (bookworm)
Description:       

Hi there,

I've encountered the error while trying to register a listener with the
LISTEN channel_name statement in my production database:

STATEMENT: LISTEN river_leadership
ERROR: could not access status of transaction 2048841894
DETAIL: Could not open file "pg_xact/07A1": No such file or directory.

That also led to an error when attempting to register a listener for any
other channel in that database.

After some investigation, I found that similar issue has been reported
several times already, but none of the investigations cocluded with a
solution except for recommendation to restart postmaster in order to clean
up the LISTEN/NOTIFY queue (as far as I understood). This workaround works,
but the root cause of this behaviour remains unaddressed. Similar bug
reports:

1.
https://www.postgresql.org/message-id/16961-25f29f95b3604a8a%40postgresql.org
2.
https://www.postgresql.org/message-id/18394-e7459245148578b2@postgresql.org

I managed to reproduce the issue locally.

Pre-conditions:

1. postgres is deployed using docker. image:
https://hub.docker.com/layers/library/postgres/16.3/images/sha256-1b277d0af2273577d8547b295e3834baaf0c04f5b3b823882f686bac23502cf7

2. autovacuum is disabled (to avoid any automatic intervention), all other
properties use default values:

> cat postgres-test.conf | grep 'autovacuum = off'
autovacuum = off

3. test table is created in postgres database (used later to generate a high
number of update transactions):

postgres=# create table test(id int);
CREATE TABLE
postgres=# insert into test values (1);
INSERT 0 1


Steps to reproduce:

1. listen to the channel (backend 1):

postgres=# LISTEN test_chan;
LISTEN

2. notify the channel (backend 2):

postgres=# NOTIFY test_chan;
NOTIFY
postgres=# select txid_current();
 txid_current
--------------
          734
(1 row)

3. receive notification (backend 1):

postgres=# LISTEN test_chan;
LISTEN
Asynchronous notification "test_chan" received from server process with PID
86.

4. execute a simple update command on test table (from the pre-conditions
section) using pgbench in order to generate enough transactions to create
pg_xact/0001 file:

> pgbench --version                                                         


pgbench (PostgreSQL) 16.3 (Debian 16.3-1.pgdg120+1)
> cat test.sql
UPDATE test SET id = 1;
> ls -lah /var/lib/postgresql/data/pg_xact
total 16K
drwx------  2 postgres postgres 4.0K Feb 11 12:45 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres 8.0K Feb 11 12:45 0000
> export PGPASSWORD=postgres; pgbench -h localhost -p 5432 -U postgres -c 80
-j 10 -t 15000 -f test.sql postgres

(pgbench generates 1_200_000 transactions in total, an empirically found
number that completely fills pg_xact/0000 and leads to pg_xact/0001
creation)

5. flush changes from shared buffered to disk (backend 3):

postgres=# CHECKPOINT;
CHECKPOINT

6. verify that pg_xact/0001 is created:

> ls -lah /var/lib/postgresql/data/pg_xact
total 304K
drwx------  2 postgres postgres 4.0K Feb 11 13:00 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres 256K Feb 11 13:00 0000
-rw-------  1 postgres postgres  40K Feb 11 13:00 0001

7. execute VACUUM FREEZE on each database at the server to freeze rows and
purge pg_xact/0000 (backend 4):

postgres=# VACUUM FREEZE;
VACUUM

postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# VACUUM FREEZE;
VACUUM

template1=# ALTER DATABASE template0 WITH ALLOW_CONNECTIONS true;
ALTER DATABASE
template1=# \c template0
You are now connected to database "template0" as user "postgres".
template0=# VACUUM FREEZE;
VACUUM

template0=# select datname, datfrozenxid from pg_database;
  datname  | datfrozenxid
-----------+--------------
 postgres  |      1200736
 template0 |      1200737
 template1 |      1200736
(3 rows)

postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# VACUUM FREEZE;
VACUUM

8. ensure that pg_xact/0000 is gone

> ls -lah /var/lib/postgresql/data/pg_xact
total 48K
drwx------  2 postgres postgres 4.0K Feb 11 15:23 .
drwx------ 19 postgres postgres 4.0K Feb 11 12:45 ..
-rw-------  1 postgres postgres  40K Feb 11 13:00 0001

9. try to listen to test_chan (backend 5):

postgres=# LISTEN test_chan;
ERROR:  could not access status of transaction 733
DETAIL:  Could not open file "pg_xact/0000": No such file or directory.

10. ensure that it is impossible to listen to any otherchannel (backend
6):

postgres=# LISTEN another_test_chan;
ERROR:  could not access status of transaction 733
DETAIL:  Could not open file "pg_xact/0000": No such file or directory

Expected behaviour:

I should be able to listen to any channel in the database after Step 7

NOTE:

I have tried the same steps on the latest minor release for my major version
(16.6, to be precise) and the issue still persists.

On Tue, 2025-06-10 at 12:02 +0200, Andrei Varashen wrote:
> Hi! Just kind reminder about bug report. I'm still waiting for some response
>
> On Tue, Feb 11, 2025 at 5:27 PM PG Bug reporting form <noreply@postgresql.org> wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference:      18804
> > Logged by:          Andrei Varashen
> > Email address:      voroshen.av@gmail.com
> > PostgreSQL version: 16.3
> > Operating system:   Debian GNU/Linux 12 (bookworm)
> > Description:       
> >
> > Hi there,
> >
> > I've encountered the error while trying to register a listener with the
> > LISTEN channel_name statement in my production database:
> >
> > STATEMENT: LISTEN river_leadership
> > ERROR: could not access status of transaction 2048841894
> > DETAIL: Could not open file "pg_xact/07A1": No such file or directory.

This is a duplicate of bug 16961:
https://www.postgresql.org/message-id/flat/16961-25f29f95b3604a8a%40postgresql.org

As far as I know, the problem happens when you are hold transactions
open for a very long time, so don't do that.

A restart of the database will fix the immediate problem.

Yours,
Laurenz Albe