Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue - Mailing list pgsql-hackers
From | Matheus Alcantara |
---|---|
Subject | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |
Date | |
Msg-id | CAFY6G8cnDOMZCfSvc62vER_9+HOs4EM52XromhOZki+=Ytv2sg@mail.gmail.com Whole thread Raw |
In response to | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue (Daniil Davydov <3danissimo@gmail.com>) |
Responses |
Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |
List | pgsql-hackers |
On Wed Aug 13, 2025 at 4:29 PM -03, Daniil Davydov wrote: > Hi, > > On Mon, Aug 11, 2025 at 8:41 PM Matheus Alcantara > <matheusssilv97@gmail.com> wrote: >> >> On Wed Aug 6, 2025 at 7:44 AM -03, Álvaro Herrera wrote: >> >> My questions: >> >> >> >> 1. Is it acceptable to drop notifications from the async queue if >> >> there are no active listeners? There might still be notifications that >> >> haven’t been read by any previous listener. >> > >> > I'm somewhat wary of this idea -- could these inactive listeners become >> > active later and expect to be able to read their notifies? >> > >> I'm bit worry about this too. > > What exactly do we mean by "active listener"? According to the source code, > the active listener (as far as I understand) is the one who listens to at least > one channel. If we have no active listeners in the database, the new listener > will set its pointer to the tail of the async queue. Thus, messages with old > xid will not be touched by anybody. I don't see any point in dropping them > in this case. > I think that this definition is correct, but IIUC the tail can still have notifications with xid's that were already truncated by vacuum freeze. When the LISTEN is executed, we first loop through the notification queue to try to advance the queue pointers and we can eventually iterate over a notification that was added on the queue without any listener but it has a xid that is already truncated by vacuum freeze, so in this case it will fail to get the transaction status. On Alex steps to reproduce the issue it first executes the NOTIFY and then executes the LISTEN which fails after vacuum freeze. > If the "inactive" listener is the backend which is stuck somewhere, the > answer is "no" - this backend should be able to process all notifications. > I tried to reproduce the issue by using some kind of "inactive" listener but so far I didn't manage to trigger the error. This is what I tried: 1. Create listener: postgres=# listen c1; 2. Execute a very long query to make the backend busy to process the notification: postgres=# select * from generate_series(1,10000000000) g where g > 1; 3. On another session send the notification postgres=# notify c1; 4. Execute pgbench test: pgbench -n -c 80 -j 10 -t 15000 -f test.sql postgres 5. Verify that we have multiple files on pg_xact: ➜ ls -lah ~/pg-devel/data/pg_xact total 608 -rw-------@ 1 matheus staff 256K Aug 18 20:36 0000 -rw-------@ 1 matheus staff 40K Aug 18 20:56 0001 6. Execute VACUUM FREEZE on every database on the server postgres=# VACUUM FREEZE; VACUUM postgres=# \c template1 You are now connected to database "template1" as user "postgres". template1=# VACUUM FREEZE; VACUUM template1=# \c template0 You are now connected to database "template0" as user "postgres". template0=# VACUUM FREEZE; VACUUM After the vacuum freeze I still can see the same files on pg_xact/ and if I cancel the long query the notification is received correctly, and then if I execute vacuum freeze again on every database the oldest pg_xact file is truncated. So, if my tests are correct I don't think that storing the oldest xid is necessary anymore since I don't think that we can lose notifications using the patch from Daniil or I'm missing something here? Thinking about this, maybe another solution for this would be to change queue advancing pointers to skip the transaction status check? With this we would not need to touch any vacuum freeze code and instead change the LISTEN execution to handle this scenario. Thoughts? -- Matheus Alcantara
pgsql-hackers by date: