Re: fixing old_snapshot_threshold's time->xid mapping - Mailing list pgsql-hackers

From Andres Freund
Subject Re: fixing old_snapshot_threshold's time->xid mapping
Date
Msg-id 20200416171441.chqp44hbrxyblz5q@alap3.anarazel.de
Whole thread Raw
In response to fixing old_snapshot_threshold's time->xid mapping  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: fixing old_snapshot_threshold's time->xid mapping
List pgsql-hackers
Hi,

On 2020-04-16 12:41:55 -0400, Robert Haas wrote:
> I'm starting a new thread for this, because the recent discussion of
> problems with old_snapshot_threshold[1] touched on a lot of separate
> issues, and I think it will be too confusing if we discuss all of them
> on one thread. Attached are three patches.

Cool.


> 0003 attempts to fix bugs in MaintainOldSnapshotTimeMapping() so that
> it produces a sensible mapping. I encountered and tried to fix two
> issues here:
> 
> First, as previously discussed, the branch that advances the mapping
> should not categorically do "oldSnapshotControl->head_timestamp = ts;"
> assuming that the head_timestamp is supposed to be the timestamp for
> the oldest bucket rather than the newest one. Rather, there are three
> cases: (1) resetting the mapping resets head_timestamp, (2) extending
> the mapping by an entry without dropping an entry leaves
> head_timestamp alone, and (3) overwriting the previous head with a new
> entry advances head_timestamp by 1 minute.

> Second, the calculation of the number of entries by which the mapping
> should advance is incorrect. It thinks that it should advance by the
> number of minutes between the current head_timestamp and the incoming
> timestamp. That would be correct if head_timestamp were the most
> recent entry in the mapping, but it's actually the oldest. As a
> result, without this fix, every time we move into a new minute, we
> advance the mapping much further than we actually should. Instead of
> advancing by 1, we advance by the number of entries that already exist
> in the mapping - which means we now have entries that correspond to
> times which are in the future, and don't advance the mapping again
> until those future timestamps are in the past.
> 
> With these fixes, I seem to get reasonably sensible mappings, at least
> in light testing.  I tried running this in one window with \watch 10:
> 
> select *, age(newest_xmin), clock_timestamp()  from
> pg_old_snapshot_time_mapping();
> 
> And in another window I ran:
> 
> pgbench -T 300 -R 10
> 
> And the age does in fact advance by ~600 transactions per minute.

I still think we need a way to test this without waiting for hours to
hit various edge cases. You argued against a fixed binning of
old_snapshot_threshold/100 arguing its too coarse. How about a 1000 or
so? For 60 days, the current max for old_snapshot_threshold, that'd be a
granularity of 01:26:24, which seems fine.  The best way I can think of
that'd keep current GUC values sensible is to change
old_snapshot_threshold to be float. Ugly, but ...?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Next
From: Fujii Masao
Date:
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority