Thread: Idea for cleaner representation of snapshots
I've always been annoyed by the implementation of HeapTupleSatisfiesVisibility (in src/include/utils/tqual.h): the normal case of an MVCC snapshot is the last one handled, and we can't expand the set of special cases without slowing it down even more. Also, as noted in tqual.h, the way we represent special snapshots is a gross violation of the rules for writing portable C. I had an idea about how to make this cleaner and faster at the same time. Make all snapshots be real pointers to SnapshotData structures (except InvalidSnapshot, which remains an alias for NULL). Add a struct field that is a pointer to the satifies function for the snapshot type. Then HeapTupleSatisfiesVisibility reduces to something like ((*(snapshot)->satisfies) ((tuple)->t_data, snapshot, buffer)) which ought to be faster than it is now. We could also add an "mvcc" bool field to simplify IsMVCCSnapshot() --- or just make it test the contents of the satisfies-function field. The names SnapshotNow etc would become names of suitably-initialized global variables (really global constants, since they'll never change). I'm half inclined to get rid of SnapshotDirty as a global variable, though: it's ugly because HeapTupleSatisfiesDirty writes into it, creating a non-reentrant API. Since the above API change causes HeapTupleSatisfiesDirty to get a pointer to the snapshot struct it's being used with, callers that need this could use a local snapshot variable to receive the output xmin/xmax. This would be cleaner than what we have in a couple of other ways too: in particular it would eliminate the gotcha that CopySnapshot/FreeSnapshot don't work on special snapshots. It might also simplify the idea we were kicking around on -patches of maintaining reference counts for snapshots; that's not gonna work for special snapshots either, unless they become real structs. Thoughts, objections? regards, tom lane
On Sat, Mar 24, 2007 at 02:00:30PM -0400, Tom Lane wrote: > I had an idea about how to make this cleaner and faster at > the same time. Make all snapshots be real pointers to > SnapshotData structures (except InvalidSnapshot, which remains > an alias for NULL). Add a struct field that is a pointer to > the satifies function for the snapshot type. Then > HeapTupleSatisfiesVisibility reduces to something like > > ((*(snapshot)->satisfies) ((tuple)->t_data, snapshot, buffer)) > > which ought to be faster than it is now. We could also add > an "mvcc" bool field to simplify IsMVCCSnapshot() --- or just > make it test the contents of the satisfies-function field. Sounds like a winner. Essentially snapshots becomes objects that have methods you can use to interact with them. Make a new shapshot type, most of the code doesn't need to care. So yep, good idea. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout <kleptog@svana.org> writes: > On Sat, Mar 24, 2007 at 02:00:30PM -0400, Tom Lane wrote: >> HeapTupleSatisfiesVisibility reduces to something like >> ((*(snapshot)->satisfies) ((tuple)->t_data, snapshot, buffer)) > Sounds like a winner. Essentially snapshots becomes objects that have > methods you can use to interact with them. Right, it's poor man's C++ ;-). I've just finished coding the patch and it looks good ... still testing though. regards, tom lane
> HeapTupleSatisfiesVisibility reduces to something like > > ((*(snapshot)->satisfies) ((tuple)->t_data, snapshot, buffer)) > > which ought to be faster than it is now. It sounds like a fine idea from the point of view of flexibility. But as far as faster... I guess it depends on how often HeapTupleSatisfiesVisibility is used in contexts where the compiler is able to optimize away the conditionals or the cpu is able to predict them accurately. I suspect in the cases where we actually care--scans where it's being called thousands of times quickly--the latter is quite effective. Function pointers are notoriously hard to optimize around and can actually make the surrounding code harder to optimize as well especially since we compile with -fno-strict-aliasing. So whether it's faster or slower may depend a lot on the specific call site. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On Sat, Mar 24, 2007 at 09:37:07PM +0000, Gregory Stark wrote: > It sounds like a fine idea from the point of view of flexibility. But as far > as faster... I guess it depends on how often HeapTupleSatisfiesVisibility is > used in contexts where the compiler is able to optimize away the conditionals > or the cpu is able to predict them accurately. I suspect in the cases where we > actually care--scans where it's being called thousands of times quickly--the > latter is quite effective. Dunno, this test by someone suggests that calling through a function pointer doesn't cost anything at all, if anything it's faster. Though that may differ on other machines obviously. http://gcc.gnu.org/ml/gcc/2004-06/msg01308.html In any case, looking at the code I don't think the compiler can optimise HeapTupleSatisfiesVisibility at all ever, because it's never specified explicitly. > Function pointers are notoriously hard to optimize around and can actually > make the surrounding code harder to optimize as well especially since we > compile with -fno-strict-aliasing. So whether it's faster or slower may depend > a lot on the specific call site. The costs of -fno-strict-aliasing are probably measurable, but I don't think we can do anything about that... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
"Martijn van Oosterhout" <kleptog@svana.org> writes: > On Sat, Mar 24, 2007 at 09:37:07PM +0000, Gregory Stark wrote: > >> It sounds like a fine idea from the point of view of flexibility. But as far >> as faster... I guess it depends on how often HeapTupleSatisfiesVisibility is >> used in contexts where the compiler is able to optimize away the conditionals >> or the cpu is able to predict them accurately. I suspect in the cases where we >> actually care--scans where it's being called thousands of times quickly--the >> latter is quite effective. > > Dunno, this test by someone suggests that calling through a function > pointer doesn't cost anything at all, if anything it's faster. Though > that may differ on other machines obviously. > > http://gcc.gnu.org/ml/gcc/2004-06/msg01308.html You should read further in that thread. The results in that post are incorect and represent the function call being inlined differently. However even the fixed version shows a speed improvement because the assembly to call a function through a pointer is shorter. In any case my point wasn't that function pointers themselves were slow. But rather that a function call through one prevents other surrounding code from being optimized. Since the compiler usually can't know what function you're calling it can't know what variables it might touch. In addition to that it can force register spills. Personally I think worrying about micro-optimization like this is not really useful. I think the real argument is that it makes the code cleaner and more flexible if we want to add new types of snapshots or track more data about snapshots. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > It sounds like a fine idea from the point of view of flexibility. But as far > as faster... I guess it depends on how often HeapTupleSatisfiesVisibility is > used in contexts where the compiler is able to optimize away the conditionals > or the cpu is able to predict them accurately. The compiler can't optimize away those conditionals, and even if the hardware can predict them accurately, one predicted jump has to be faster than five successive ones. The branch-prediction hardware works for function pointer calls too (else C++ would be seriously slow). regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> It sounds like a fine idea from the point of view of flexibility. But as far >> as faster... I guess it depends on how often HeapTupleSatisfiesVisibility is >> used in contexts where the compiler is able to optimize away the conditionals >> or the cpu is able to predict them accurately. > > The compiler can't optimize away those conditionals, and even if the > hardware can predict them accurately, one predicted jump has to be > faster than five successive ones. The branch-prediction hardware works > for function pointer calls too (else C++ would be seriously slow). That's not really the point. The problem is that the compiler usually can't deduce which function you're calling or even which set of functions you might be calling. So, for example, the compiler will have trouble determining which variables may be untouched over the function call and thus not have to be reloaded into registers. Again though, I really don't think it matters. It might be 1% faster or 1% slower but HeapSatisfiesVisibility is itself only going to be a small fraction of the total time and so even on a heavily cpu-bound test we're talking about fractions of a percent either way. If it lets us do one algorithmic improvement we couldn't otherwise do the gain from that flexibility down the line will be many orders of magnitude larger and more important than a small microptimization. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > That's not really the point. The problem is that the compiler usually can't > deduce which function you're calling or even which set of functions you might > be calling. So, for example, the compiler will have trouble determining which > variables may be untouched over the function call and thus not have to be > reloaded into registers. Well, since the functions in question are in a separate module from their callers, I think it highly unlikely that any such optimizations would be applied in practice anyway. > Again though, I really don't think it matters. Agreed, it's unlikely this would be a significant change either way. regards, tom lane
I wrote: > Gregory Stark <stark@enterprisedb.com> writes: >> Again though, I really don't think it matters. > Agreed, it's unlikely this would be a significant change either way. Just for the record, pgbench numbers seem unaffected by this patch (on Fedora Core 6 x86_64). regards, tom lane