Home > mailing lists

Re: Long running query causing XID limit breach - Mailing list pgsql-general

From	sud
Subject	Re: Long running query causing XID limit breach
Date	May 26, 2024 09:15:50
Msg-id	CAD=mzVUzDD_XDa+BJ8fH96pEwgTG1NHUvaoyVePLhp3xEN=J9A@mail.gmail.com Whole thread
In response to	Re: Long running query causing XID limit breach (Torsten Förtsch <tfoertsch123@gmail.com>)
Responses	Re: Long running query causing XID limit breach
List	pgsql-general

Tree view

On Sun, May 26, 2024 at 1:43 PM Torsten Förtsch <tfoertsch123@gmail.com> wrote:

On Sat, May 25, 2024 at 11:00 PM sud <suds1434@gmail.com> wrote:

But i have one question here , does max_standby_streaming_delay = 14 , means the queries on the standby will get cancelled after 14 seconds?

No, your query gets cancelled when it stalls replication for >14 sec. If your master is idle and does not send any WAL and the replica has caught up, the query can take as long as it wants.

Thank you so much.

For example , in below scenario,

if i have insert query going on on primary instance on table 25th may partition of TABLE1, and at same time we are selecting data from 24th May partition , then with "max_standby_streaming_delay = 14" setup , it just allows the select query to run for any duration without any restriction even if the WAL gets applied on the standby regularly. Also INSERT query in primary won't make the standby SELECT queries to cancel as because the WAL record of INSERT queries on the primary instance is not conflicting to the exact rows those were being read by the standby. Is my understanding correct here?

However, if i have Update/Delete query going on on primary instance on table 25th may partition of TABLE1 and on the exact same set of rows which were being read by the standby instance by the SELECT query, then the application of such WAL record to standby can max wait for 14 seconds and thus those select query are prone to be cancelled after 14 seconds. Is this understanding correct?

If the above is true then it doesn't look good, as because in an OLTP system there will be a lot of DMLS happening on the writer instances and there may be many queries running on the reader/standby instances which are meant to run for hours. And if we say making those SELECT queries run for hours means compromising an hour of "high availability"/RPO or a lag of an hour between primary and standby , that doesn't look good. Please correct me if I am missing something here.

pgsql-general by date:

From: Torsten Förtsch
Date: 26 May 2024, 08:13:28
Subject: Re: Long running query causing XID limit breach

From: Victor Dobrovolsky
Date: 26 May 2024, 13:20:10
Subject: scalar plpgsql functions and their stability flags

Re: Long running query causing XID limit breach - Mailing list pgsql-general

Previous

Next