Simon Riggs wrote:
> On Sat, 2008-09-13 at 10:48 +0100, Florian G. Pflug wrote:
>
>> The main idea was to invert the meaning of the xid array in the snapshot
>> struct - instead of storing all the xid's between xmin and xmax that are
>> to be considered "in-progress", the array contained all the xid's >
>> xmin that are to be considered "completed".
>
>> The downside is that the size of the read-only snapshot is theoretically
>> unbounded, which poses a bit of a problem if it's supposed to live
>> inside shared memory...
>
> Why do it inverted? That clearly has problems.

Because it solves the problem of "spontaneously" appearing XIDs in the
WAL. At least prior to 8.3 with virtual xids, a transaction might have
allocated its xid long before actually writing anything to disk, and
therefore long before this XID ever shows up in the WAL. And with a
non-inverted snapshot such an XID would be considered to be "completed"
by transactions on the slave... So, one either needs to periodically log
a snapshot on the master or log XID allocations, both of which seem to
cause considerable additional load on the master. With an inverted
snapshot, it's sufficient to log the current RecentXmin - a value that
is readily available on the master, and therefore the cost amounts to
just one additional 4-byte field per xlog entry.
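
To make the difference concrete, here is a minimal C sketch of the two
visibility checks. It is an illustration only, not the actual patch:
the names Xid, SketchSnapshot, xid_listed, completed_conventional and
completed_inverted are hypothetical, and xid wraparound is ignored for
simplicity (plain integer comparison is assumed).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;   /* hypothetical stand-in for a 4-byte xid */

/*
 * Hypothetical snapshot. The meaning of "xids" depends on the check used:
 * in-progress xids (conventional) or completed xids (inverted).
 */
typedef struct
{
    Xid         xmin;   /* every xid below this is completed */
    Xid         xmax;   /* used by the conventional check only */
    const Xid  *xids;
    size_t      nxids;
} SketchSnapshot;

/* Linear search for xid in the snapshot's array. */
bool
xid_listed(const SketchSnapshot *snap, Xid xid)
{
    for (size_t i = 0; i < snap->nxids; i++)
    {
        if (snap->xids[i] == xid)
            return true;
    }
    return false;
}

/*
 * Conventional snapshot: the array lists in-progress xids, so an xid in
 * [xmin, xmax) that is NOT listed is treated as completed.
 */
bool
completed_conventional(const SketchSnapshot *snap, Xid xid)
{
    assert(snap != NULL);
    if (xid < snap->xmin)
        return true;
    if (xid >= snap->xmax)
        return false;
    return !xid_listed(snap, xid);
}

/*
 * Inverted snapshot: the array lists completed xids at or above xmin, so
 * an xid that is NOT listed is treated as still in progress.
 */
bool
completed_inverted(const SketchSnapshot *snap, Xid xid)
{
    assert(snap != NULL);
    if (xid < snap->xmin)
        return true;
    return xid_listed(snap, xid);
}
```

Suppose the slave has seen xid 100 (still running) and xid 102
(committed) in the WAL, while xid 101 was allocated on the master but
has not written anything yet. A conventional snapshot built from the
WAL lists only 100 as in-progress, so 101 is wrongly treated as
completed. The inverted snapshot lists only 102 as completed, so 101
correctly counts as in progress without the slave ever having heard
of it.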
regards, Florian Pflug