Re: suboverflowed subtransactions concurrency performance optimize - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: suboverflowed subtransactions concurrency performance optimize |
Date | |
Msg-id | Yo8q4r+3GAAgm5kd@paquier.xyz Whole thread Raw |
In response to | Re: suboverflowed subtransactions concurrency performance optimize (Andres Freund <andres@anarazel.de>) |
Responses |
Re: suboverflowed subtransactions concurrency performance optimize
|
List | pgsql-hackers |
On Tue, May 24, 2022 at 04:52:50PM -0700, Andres Freund wrote: > As is, this strikes me as dangerous. At the very least this ought to be > structured so it can have assertions verifying that the cache contents are > correct. Well, under USE_ASSERT_CHECKING we could force a recalculation of the loop itself before re-checking and sending the cached result, as one thing. > It's far from obvious that it is correct to me, fwiw. Potential issues: > > 1) The result of SubTransGetTopmostTransaction() can change between subsequent > calls. If TransactionXmin advances, the TransactionXmin cutoff can change > the result. It might be unreachable or harmless, but it's not obvious that > it is, and there's zero comments explaining why it is obvious. I am not sure to follow on this one. A change in the TransactionXmin cutoff does not change the result retrieved for parentXid from the SLRU layer, because the xid cached refers to a parent still running. > 2) xid wraparound. There's nothing forcing SubTransGetTopmostTransaction() to > be called regularly, so even if a backend isn't idle, the cache could just > get more and more outdated until hitting wraparound Hence, you mean that the non-regularity of the call makes it more exposed to an inconsistent result after a wraparound? > To me it also seems odd that we cache in SubTransGetTopmostTransaction(), but > not in SubTransGetParent(). I think it's at least as common to end up with > subtrans access via TransactionIdDidCommit(), which calls SubTransGetParent() > rather than SubTransGetTopmostTransaction()? Why is > SubTransGetTopmostTransaction() the correct layer for caching? Hmm. I recall thinking about this exact point but left it out of the caching to maintain a symmetry with the setter routine that does the same and reverse operation on those SLRUs. Anyway, one reason to not use SubTransGetParent() is that it may return an invalid XID which we'd better not cache depending on its use (say, a serialized transaction), and SubTransGetTopmostTransaction() looping around to we make sure to never hit this case looks like the correct path to do do. Well, we could also store nothing if an invalid parent is found, but then the previous argument about the symmetry of the routines would not apply. This would be beneficial about cases like the one at the top of the thread about SLRU caches when subxids are overflowing when referring to the same XID. The ODBC driver likes a lot savepoints, for example. > I tried to find a benchmark result for this patch upthread, without > success. Has there been validation this helps with anything? I have been studying that again, and you are right that I should have asked for much more here. A benchmark like what's presented upthread may show some benefits with the case of the same savepoint used across multiple queries, only if with a caching of SubTransGetParent(), with enough subxids exhausted to overflow the snapshots. It would be better to revisit that stuff, and the benefit is limited with only SubTransGetTopmostTransaction(). Point 2) is something I did not consider, and that's a good one. For now, it looks better to revert this part rather than tweak it post beta1. -- Michael
Attachment
pgsql-hackers by date: