On 19/12/2025 17:16, Andres Freund wrote:
> On 2025-12-19 01:30:05 +0200, Heikki Linnakangas wrote:
>> Patch 0002: Split SnapshotData into separate structs for each kind of
>> snapshot
>>
>> This implements the long-standing TODO and splits SnapshotData up into
>> multiple structs. This makes it more clear which fields are used with which
>> kind of snapshot. For example, we now have properly named fields for the XID
>> arrays in historic snapshots. Previously, they abused the 'xip' and 'subxip'
>> arrays to mean something different than what they mean in regular MVCC
>> snapshots.
>>
>> This introduces some new casts between Snapshots and the new MVCCSnapshots.
>> I struggled to decide which functions should use the new MVCCSnapshot type
>> and which should continue to use Snapshot. It's a balancing act between code
>> churn and where we want to have casts. For example, PushActiveSnapshot()
>> takes a Snapshot argument, but assumes that it's an MVCC snapshot, so it
>> really should take MVCCSnapshot. But most of its callers have a Snapshot
>> rather than MVCCSnapshot. Another example: the executor assumes that it's
>> dealing with MVCC snapshots, so it would be appropriate to use MVCCSnapshot
>> for EState->es_snapshot. But it'd cause a lot code churn and casts
>> elsewhere. I think the patch is a reasonable middle ground.
>
> Generally I think this is good. A few points:
> - I'd use a SnapshotBase type or such that then has the SnapshotType as a
> member. That way we can more cleanly add common fields if we need them
Ok. (Chao suggested that too)
> - I'd rename the fields for dirty snapshots, given how differently they're
> used for dirty snapshots, compared to anything else
Hmm. The downside is a little more code churn, also in extensions that
use DirtySnapshot. There are only few of them though, based on quick
grepping I believe only pglogical is affected, so I think you're right.
I renamed them to 'updating_xmin' and 'updating_xmax'.
> - I'm somewhat doubtful that it's rigth to restict active snapshots to plain
> mvcc ones. Why is that the right thing? What if a historic mvcc snapshot is
> supposed to be pushed? What if we introduce different types of snapshots in
> the future, e.g. CSN ones on the standby?
Rats! Historic MVCC snapshots can indeed be pushed to the active stack.
There is no core code that does that, but pglogical does it. This is
precisely the case that Noah pointed out earlier [1], when I added an
assertion against that in GetTransactionSnapshot(), and had to revert it.
Ok so we need to keep the support for that. That complicates the first
patches in this series a little, but it's not too bad.
[1]
https://www.postgresql.org/message-id/20250809222338.cc.nmisch%40google.com
>> Patch 0004: Add an explicit 'valid' flag to MVCCSnapshotData
>>
>> Makes it more clear when the "static" snapshots returned by
>> GetTransactionSnapshot(), GetLatestSnapshot() and GetCatalogSnapshot() are
>> valid.
>
> Given they're pointing to static memory location, won't that limit how much we
> can rely on valid? If something else builds a snapshot a "wrong" reference to
> a snapshot will suddenly appear valid again, no?
It doesn't catch everything, but it catches more misuse than without it.
For example, this would go unnoticed on 'master', but would hit the new
assertion on the 'valid' flag:
snapshot = GetTransactionSnapshot();
PushActiveSnapshot(snapshot);
/* this invalidates the 'snapshot' and advances the advertised xmin */
PopActiveSnapshot();
/* This is wrong! 'snapshot' is invalid */
PushActiveSnapshot(snapshot);
But this mistake goes unnoticed with or without these patches:
snapshot = GetTransactionSnapshot();
PushActiveSnapshot(snapshot);
/* this invalidates the 'snapshot' and advances the advertised xmin */
PopActiveSnapshot();
other_snapshot = GetTransactionSnapshot();
/*
* This "works", because 'snapshot' and 'other_snapshot' both point to
* the same static SnapshotData. But it makes the *new* snapshot active,
* not the first one!
*/
PushActiveSnapshot(snapshot);
Here's a new patch set. I only included the more mature patches for now,
the rest need to be rebased.
The first patch is new. It adds sanity checks to a few places that
previously assumed that they were dealing with a regular MVCC snapshot
without checking.
Patch 0002 now accepts historic MVCC snapshots in PushActiveSnapshot().
I tested that it works with pglogical now, after updating pglogical
renamed fields and such.
- Heikki