Re: A few patches to clarify snapshot management, part 2 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: A few patches to clarify snapshot management, part 2
Date
Msg-id 1c83dd8a-3756-404f-bf8a-4a03d9fd8059@iki.fi
Whole thread Raw
In response to Re: A few patches to clarify snapshot management, part 2  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: A few patches to clarify snapshot management, part 2
List pgsql-hackers
On 19/12/2025 07:15, Chao Li wrote:
>> On Dec 19, 2025, at 07:30, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> Patch 0002: Split SnapshotData into separate structs for each kind of snapshot
>> >> This implements the long-standing TODO and splits SnapshotData up
>> into multiple structs. This makes it more clear which fields are
>> used with which kind of snapshot. For example, we now have
>> properly named fields for the XID arrays in historic snapshots.
>> Previously, they abused the 'xip' and 'subxip' arrays to mean
>> something different than what they mean in regular MVCC snapshots.
>> 
>> This introduces some new casts between Snapshots and the new
>> MVCCSnapshots. I struggled to decide which functions should use
>> the new MVCCSnapshot type and which should continue to use
>> Snapshot. It's a balancing act between code churn and where we
>> want to have casts. For example, PushActiveSnapshot() takes a
>> Snapshot argument, but assumes that it's an MVCC snapshot, so it
>> really should take MVCCSnapshot. But most of its callers have a
>> Snapshot rather than MVCCSnapshot. Another example: the executor
>> assumes that it's dealing with MVCC snapshots, so it would be
>> appropriate to use MVCCSnapshot for EState->es_snapshot. But it'd
>> cause a lot code churn and casts elsewhere. I think the patch is a
>> reasonable middle ground.
> > The core of 0002 is the new union Snapshot:
> ```
> +/*
> + * Generic union representing all kind of possible snapshots.  Some have
> + * type-specific structs.
> + */
> +typedef union SnapshotData
> +{
> +    SnapshotType snapshot_type; /* type of snapshot */
> +
> +    MVCCSnapshotData mvcc;
> +    DirtySnapshotData dirty;
> +    HistoricMVCCSnapshotData historic_mvcc;
> +    NonVacuumableSnapshotData nonvacuumable;
>   } SnapshotData;
> ```
> > And my big concern is here. This union definition looks unusual,
> snapshot_type shares the same space with real snapshot bodies, so
> that once snapshot is assigned to the union, that type info is lost,
> there would be no way to decide what exact snapshot is stored in
> SnapshotData.

Each of the structs, MVCCSnapshotData, DirtySnapshotData, etc., contain 
'snapshot_type' as the first field, so it's always available.

> I think SnapshotData should be a structure and utilize anonymous union technique:
> ```
> typedef struct SnapshotData
> {
>      SnapshotType snapshot_type; /* type of snapshot */
> 
>      union {
>         MVCCSnapshotData mvcc;
>         DirtySnapshotData dirty;
>         HistoricMVCCSnapshotData historic_mvcc;
>         NonVacuumableSnapshotData nonvacuumable;
>      }
> ```

Hmm, yeah, maybe that would be more clear. But then you cannot cast an 
"MVCCSnapshotData *" to "SnapshotData *", or vice versa.

> If you agree with this, then 0002 will need some rework. A set of helper micros/functions would need to be added,
e.g.InitDirtySnapshot(), DirtySnapshotGetSnapshot()...
 

Right, I feel those helper macros / functions would be excessive.

- Heikki




pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: A few patches to clarify snapshot management, part 2
Next
From: Kirill Reshke
Date:
Subject: Re: Allow GUC settings in CREATE SUBSCRIPTION CONNECTION to take effect