Re: CSN snapshots in hot standby - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: CSN snapshots in hot standby |
Date | |
Msg-id | swzptfywx4i2jj65oi3v3atsiywho3qzr5rj4rjr7ktzpxinmh@nwoaps6qvuqy |
In response to | Re: CSN snapshots in hot standby (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
Hi,

On 2024-08-13 23:13:39 +0300, Heikki Linnakangas wrote:
> I added a tiny cache of the CSN lookups into SnapshotData, which can hold
> the values of 4 XIDs that are known to be visible to the snapshot, and 4
> invisible XIDs. This is pretty arbitrary, but the idea is to have something
> very small to speed up the common cases that 1-2 XIDs are repeatedly looked
> up, without adding too much overhead.
>
>
> I did some performance testing of the visibility checks using these CSN
> snapshots. The tests run SELECTs with a SeqScan in a standby, over a table
> where all the rows have xmin/xmax values that are still in-progress in the
> primary.
>
> Three test scenarios:
>
> 1. large-xact: one large transaction inserted all the rows. All rows have
> the same XMIN, which is still in progress
>
> 2. many-subxacts: one large transaction inserted each row in a separate
> subtransaction. All rows have a different XMIN, but they're all
> subtransactions of the same top-level transaction. (This causes the subxids
> cache in the proc array to overflow)
>
> 3. few-subxacts: All rows are inserted, committed, and vacuum frozen. Then,
> using 10 in separate subtransactions, DELETE the rows, in an interleaved
> fashion. The XMAX values cycle like this "1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1,
> 2, 3, 4, 5, ...". The point of this is that these sub-XIDs fit in the
> subxids cache in the procarray, but the pattern defeats the simple 4-element
> cache that I added.

I'd like to see some numbers for a workload with many overlapping top-level
transactions. In contrast to 2), HEAD wouldn't need to do subtrans lookups,
whereas this patch would need to do CSN lookups. And a four-entry cache
probably wouldn't help very much.

> +/*
> + * Record commit LSN of a transaction and its subtransaction tree.
> + *
> + * xid is a single xid to set status for. This will typically be the top level
> + * transaction ID for a top level commit.
> + *
> + * subxids is an array of xids of length nsubxids, representing subtransactions
> + * in the tree of xid. In various cases nsubxids may be zero.
> + *
> + * commitLsn is the LSN of the commit record. This is currently never called
> + * for aborted transactions.
> + */
> +void
> +CSNLogSetCSN(TransactionId xid, int nsubxids, TransactionId *subxids,
> +	XLogRecPtr commitLsn)
> +{
> +	int pageno;
> +	int i = 0;
> +	int offset = 0;
> +
> +	Assert(TransactionIdIsValid(xid));
> +
> +	pageno = TransactionIdToPage(xid); /* get page of parent */
> +	for (;;)
> +	{
> +		int num_on_page = 0;
> +
> +		while (i < nsubxids && TransactionIdToPage(subxids[i]) == pageno)
> +		{
> +			num_on_page++;
> +			i++;
> +		}

Hm - is there any guarantee / documented requirement that subxids is sorted?

> +		CSNLogSetPageStatus(xid,
> +			num_on_page, subxids + offset,
> +			commitLsn, pageno);
> +		if (i >= nsubxids)
> +			break;
> +
> +		offset = i;
> +		pageno = TransactionIdToPage(subxids[offset]);
> +		xid = InvalidTransactionId;
> +	}
> +}

Hm. Maybe I'm missing something, but what prevents a concurrent transaction
from checking the visibility of a subtransaction between marking the
subtransaction committed and marking the main transaction committed? If the
subtransaction and the main transaction are on the same page that won't be
possible, but if they are on different pages it does seem possible?

Today XidInMVCCSnapshot() will use pg_subtrans to find the top-level
transaction in case of a suboverflowed snapshot, but with this patch that's
not the case anymore. Which afaict will mean that repeated snapshot
computations could give different results for the same query?

Greetings,

Andres Freund
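
The sortedness question about subxids can be made concrete with a small standalone sketch. This is not PostgreSQL code: XACTS_PER_PAGE, xid_to_page() and count_page_calls() are hypothetical stand-ins, and the page size of 1024 entries is an assumption chosen only for illustration. The function mirrors the control flow of the quoted loop and counts how often the per-page routine would be invoked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed entries per page, for illustration only; not the real value. */
#define XACTS_PER_PAGE 1024

/* Hypothetical stand-in for TransactionIdToPage(). */
static int
xid_to_page(TransactionId xid)
{
	return (int) (xid / XACTS_PER_PAGE);
}

/*
 * Mirrors the control flow of the quoted CSNLogSetCSN() loop and returns
 * how many per-page calls it would make for the given subxid order.
 */
static int
count_page_calls(TransactionId xid, int nsubxids, const TransactionId *subxids)
{
	int pageno = xid_to_page(xid);	/* start on the parent's page */
	int i = 0;
	int calls = 0;

	assert(subxids != NULL || nsubxids == 0);

	for (;;)
	{
		/* consume the run of consecutive subxids on the current page */
		while (i < nsubxids && xid_to_page(subxids[i]) == pageno)
			i++;

		calls++;	/* stands in for one CSNLogSetPageStatus() call */

		if (i >= nsubxids)
			break;

		/* move on to the page of the next unprocessed subxid */
		pageno = xid_to_page(subxids[i]);
	}
	return calls;
}
```

Under these assumptions, with parent XID 100, the subxids 101, 102, 2001, 2002 result in two per-page calls, while the interleaved order 101, 2001, 102, 2002 results in four. Every subxid is still covered in either order, but unsorted input makes the loop revisit pages, so whether callers guarantee a sorted array affects how many page accesses a commit needs.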