Re: MVCC catalog access - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: MVCC catalog access |
Date | |
Msg-id | CA+Tgmoaug2vg569gYhJd7Q7bb8NEL=SB51V=bGak2p5gp+mR0A@mail.gmail.com Whole thread Raw |
In response to | Re: MVCC catalog access (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: MVCC catalog access
|
List | pgsql-hackers |
On Tue, Jul 2, 2013 at 9:52 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> > * possibly paranoid, but I'd add a Assert to heap_beginscan_catalog or >> > GetCatalogSnapshot ensuring we're dealing with a catalog relation. The >> > consistency mechanisms in GetCatalogSnapshot() only work for those, so >> > there doesn't seem to be a valid usecase for non-catalog relations. >> >> It'll actually work find as things stand; it'll just take a new >> snapshot every time. > > Ok. Doesn't really change my opinion that it's a crappy idea to use it > otherwise ;) I agree, but I don't see an easy way to write the assertion you want using only the OID. >> - In genam.c and execUtils.c, we treat SnapshotNow as a kind of >> default snapshot. That seems like a crappy idea. I propose that we >> either set that pointer to NULL and let the server core dump if the >> snapshot doesn't get set or (maybe better) add a new special snapshot >> called SnapshotError which just errors out if you try to use it for >> anything, and initialize to that. > > I vote for SnapshotError. OK. >> - I'm not quite sure what to do about get_actual_variable_range(). >> Taking a new MVCC snapshot there seems like it might be pricey on some >> workloads. However, I wonder if we could use SnapshotDirty. >> Presumably, even uncommitted tuples affect the amount of >> index-scanning we have to do, so that approach seems to have some >> theoretical justification. But I'm worried there could be unfortunate >> consequences to looking at uncommitted data, either now or in the >> future. SnapshotSelf seems less likely to have that problem, but >> feels wrong somehow. > > I don't like using SnapshotDirty either. Can't we just use the currently > active snapshot? Unless I miss something this always will be called > while we have one since when we plan we've done an explicit > PushActiveSnapshot() and if we need to replan stuff during execution > PortalRunSelect() will have pushed one. We could certainly do that, but I'd be a bit reluctant to do so without input from Tom. I imagine there might be cases where it could cause a regression. >> - currtid_byreloid() and currtid_byrelname() use SnapshotNow as an >> argument to heap_get_latest_tid(). I don't know what these functions >> are supposed to be good for, but taking a new snapshot for every >> function call seems to guarantee that someone will find a way to use >> these functions as a poster child for how to brutalize PGXACT, so I >> don't particularly want to do that. > > Heikki mentioned that at some point they were added for the odbc > driver. I am not particularly inclined to worry about taking too many > snapshots here. Well, if it uses it with any regularity I think it's a legitimate concern. We have plenty of customers who use ODBC and some of them allow frightening numbers of concurrent server connections. Now you may say that's a bad idea, but so is 1000 backends doing txid_current() in a loop. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: