Hi,
On 2023-01-10 15:03:42 +0100, Matthias van de Meent wrote:
> On Mon, 9 Jan 2023 at 20:34, Andres Freund <andres@anarazel.de> wrote:
> > On 2023-01-09 17:50:10 +0100, Matthias van de Meent wrote:
> > > Wouldn't it be enough to only fix the constructions in
> > > FullXidRelativeTo() and widen_snapshot_xid() (as attached, $topic does
> > > not occur with the patch), and (optionally) bump the first XID
> > > available for any cluster to (FirstNormalXid + 1) to retain the 'older
> > > than any running transaction' property?
> >
> > It's not too hard to fix in individual places, but I suspect that we'll
> > introduce the bug in future places without some more fundamental protection.
> >
> > Locally I fixed it by clamping vacuum_defer_cleanup_age to a reasonable value
> > in ComputeXidHorizons() and GetSnapshotData().
>
> I don't think that clamping the value with oldestXid (as seen in patch
> 0001, in GetSnapshotData) is right.
I agree that using oldestXid to clamp is problematic.
> It would clamp the value relative to the oldest frozen xid of all
> databases, which can be millions of transactions behind oldestXmin,
> and thus severely skew the amount of transaction's changes you keep on
> disk (that is, until oldestXid moves past 1000_000).
What precisely do you mean with "skew" here? Do you just mean that it'd take a
long time until vacuum_defer_cleanup_age takes effect? Somehow it sounds like
you might mean more than that?
I'm tempted to go with reinterpreting 64bit xids as signed. Except that it
seems like a mighty invasive change to backpatch.
Greetings,
Andres Freund