Re: Why is hot_standby_feedback off by default? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Why is hot_standby_feedback off by default?
Date
Msg-id 8800FAF9-22B0-46AC-817D-9DD6FA08A45D@anarazel.de
In response to Re: Why is hot_standby_feedback off by default?  (Vik Fearing <vik@postgresfriends.org>)
Responses Re: Why is hot_standby_feedback off by default?  (sirisha chamarthi <sirichamarthi22@gmail.com>)
Re: Why is hot_standby_feedback off by default?  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
Hi,

On October 22, 2023 4:56:15 AM PDT, Vik Fearing <vik@postgresfriends.org> wrote:
>On 10/22/23 09:50, sirisha chamarthi wrote:
>> Is there any specific reason hot_standby_feedback default is set to off?
>
>
>Yes.  No one wants a rogue standby to ruin production.

Medium term, I think we need an approximate xid->"time of assignment" mapping that's continually maintained on the primary. One of the things that'd allow us to do is introduce a GUC to control the maximum effect of hs_feedback on the primary, in a useful unit. Numbers of xids are not a useful unit (100k xids is forever on some systems, a few minutes at best on others, and the rate is not necessarily steady when plpgsql exception handlers are used, ...)
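To make the idea concrete, here is a minimal Python sketch of such a mapping and of a time-based cap on the feedback xmin. It is only an illustration under stated assumptions, not PostgreSQL code: `XidTimeMap`, `record`, `time_of`, `xid_at` and `clamp_feedback_xmin` are hypothetical names, the primary is assumed to sample (xid, timestamp) pairs periodically with non-decreasing timestamps, and xids are treated as ever-increasing 64-bit values, so wraparound is ignored.

```python
import bisect
from collections import deque


class XidTimeMap:
    """Approximate xid -> assignment-time mapping built from periodic samples.

    Hypothetical sketch: xids are treated as monotonically increasing
    64-bit values, so 32-bit wraparound is deliberately ignored.
    """

    def __init__(self, max_samples=1024):
        # Bounded history of (xid, unix_time) pairs, oldest first.
        self._samples = deque(maxlen=max_samples)

    def record(self, xid, when):
        # Skip samples that do not advance the xid, so stored xids
        # stay strictly increasing. Timestamps are assumed non-decreasing.
        if self._samples and xid <= self._samples[-1][0]:
            return
        self._samples.append((xid, when))

    def time_of(self, xid):
        """Estimate when `xid` was assigned, clamped to the sampled range."""
        samples = list(self._samples)
        if not samples:
            raise ValueError("no samples recorded")
        xids = [s[0] for s in samples]
        i = bisect.bisect_left(xids, xid)
        if i == 0:
            return samples[0][1]
        if i == len(samples):
            return samples[-1][1]
        (x0, t0), (x1, t1) = samples[i - 1], samples[i]
        # Linear interpolation between the two surrounding samples.
        return t0 + (t1 - t0) * (xid - x0) / (x1 - x0)

    def xid_at(self, when):
        """Estimate the newest xid assigned at or before time `when`."""
        samples = list(self._samples)
        if not samples:
            raise ValueError("no samples recorded")
        times = [s[1] for s in samples]
        i = bisect.bisect_right(times, when)
        if i == 0:
            return samples[0][0]
        if i == len(samples):
            return samples[-1][0]
        (x0, t0), (x1, t1) = samples[i - 1], samples[i]
        # Here t0 <= when < t1, so the divisor is strictly positive.
        return x0 + int((x1 - x0) * (when - t0) / (t1 - t0))


def clamp_feedback_xmin(xid_map, reported_xmin, now, max_age_seconds):
    """Cap how far back a standby's feedback xmin may hold the primary.

    The limit is expressed in seconds and converted to an xid via the map.
    """
    oldest_allowed = xid_map.xid_at(now - max_age_seconds)
    return max(reported_xmin, oldest_allowed)
```

With samples (1000, t=0), (2000, t=100) and (102000, t=200), a 50 second limit evaluated at t=200 would raise a reported xmin of 1200 to roughly 52000, while a reported xmin of 90000 would pass through unchanged. The same limit would allow far fewer xids during the slow first interval, which is the reason for expressing it in time rather than in xids.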

It'd be useful to have such a mapping for other features too. E.g.

- making it visible in pg_stat_activity how problematic a long-running xact is - a 3 day old xact that doesn't have an xid assigned and has a recent xmin is fine, it won't prevent vacuum from doing things. But a somewhat recent xact that still has a snapshot from before an old xact was cancelled could be problematic.

- turning pg_class.relfrozenxid into an understandable timeframe. It's a fair bit of mental effort to classify "370M xids old" into problem/fine (it's e.g. not a problem on a system with a high xid rate, on a big table that takes a while to vacuum).

- using the mapping to compute an xid consumption rate IMO would be one building block for smarter AV scheduling. Together with historical vacuum runtimes it'd allow us to start vacuuming early enough to prevent hitting thresholds, adapt pacing, prioritize between tables etc.
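As a rough illustration of the last two points, the following Python sketch derives an xid consumption rate from (xid, timestamp) samples and uses it to turn an xid age into a time estimate. All names (`xid_rate`, `seconds_until_age`, `should_start_vacuum`, `safety_factor`) are hypothetical, the rate is a plain average over the sample window, and the scheduling rule is just one possible heuristic, not how autovacuum works today.

```python
def xid_rate(samples):
    """Average xids consumed per second over (xid, unix_time) samples, oldest first."""
    (x0, t0), (x1, t1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (x1 - x0) / (t1 - t0)


def seconds_until_age(current_age, threshold_age, rate):
    """Estimate how long until a table's xid age reaches `threshold_age`."""
    if current_age >= threshold_age:
        return 0.0
    if rate <= 0:
        # No xids are being consumed, so the threshold is never reached.
        return float("inf")
    return (threshold_age - current_age) / rate


def should_start_vacuum(current_age, threshold_age, rate,
                        expected_runtime, safety_factor=2.0):
    """Start early enough that a vacuum of typical duration finishes in time.

    `expected_runtime` is assumed to come from historical vacuum runtimes
    for the table; `safety_factor` is an arbitrary margin for this sketch.
    """
    remaining = seconds_until_age(current_age, threshold_age, rate)
    return remaining <= expected_runtime * safety_factor
```

For example, at 1000 xids per second a table that is 170M xids old is about 30000 seconds (a bit over 8 hours) away from a 200M threshold, which is the default value of autovacuum_freeze_max_age. If vacuuming that table has historically taken 20000 seconds, this heuristic would start now; if it takes an hour, there is no urgency yet.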

Greetings,

Andres
