Re: MVCC catalog access - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: MVCC catalog access |
Date | |
Msg-id | CA+TgmoY3ALC3GERRC=-A6mWOh7ZYKUdGN9VhA20B_-2oj3nt2w@mail.gmail.com Whole thread Raw |
In response to | Re: MVCC catalog access (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: MVCC catalog access
Re: MVCC catalog access |
List | pgsql-hackers |
On Tue, Jul 2, 2013 at 9:02 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Some things that might be worth changing when committing: > * Could you add a Assert(!RelationHasSysCache(relid)) to > RelationInvalidatesSnapshotsOnly? It's not unlikely that it will be > missed by the next person adding a syscache and that seems like it > could have ugly and hard to detect consequences. There's a cross-check in InitCatalogCache() for that very issue. > * maybe use bsearch(3) instead of open coding the binary search? We > already use it in the backend. I found comments elsewhere indicating that bsearch() was slower than open-coding it, so I copied the logic used for ScanKeywordLookup(). > * possibly paranoid, but I'd add a Assert to heap_beginscan_catalog or > GetCatalogSnapshot ensuring we're dealing with a catalog relation. The > consistency mechanisms in GetCatalogSnapshot() only work for those, so > there doesn't seem to be a valid usecase for non-catalog relations. It'll actually work find as things stand; it'll just take a new snapshot every time. I have a few ideas for getting rid of the remaining uses of SnapshotNow that I'd like to throw out there: - In pgrowlocks and pgstattuple, I think it would be fine to use SnapshotSelf instead of SnapshotNow. The only difference is that it includes changes made by the current command that wouldn't otherwise be visible until CommandCounterIncrement(). That, however, is not really a problem for their usage. - In genam.c and execUtils.c, we treat SnapshotNow as a kind of default snapshot. That seems like a crappy idea. I propose that we either set that pointer to NULL and let the server core dump if the snapshot doesn't get set or (maybe better) add a new special snapshot called SnapshotError which just errors out if you try to use it for anything, and initialize to that. - I'm not quite sure what to do about get_actual_variable_range(). Taking a new MVCC snapshot there seems like it might be pricey on some workloads. However, I wonder if we could use SnapshotDirty. Presumably, even uncommitted tuples affect the amount of index-scanning we have to do, so that approach seems to have some theoretical justification. But I'm worried there could be unfortunate consequences to looking at uncommitted data, either now or in the future. SnapshotSelf seems less likely to have that problem, but feels wrong somehow. - currtid_byreloid() and currtid_byrelname() use SnapshotNow as an argument to heap_get_latest_tid(). I don't know what these functions are supposed to be good for, but taking a new snapshot for every function call seems to guarantee that someone will find a way to use these functions as a poster child for how to brutalize PGXACT, so I don't particularly want to do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: