Another speedup idea (two, even) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Another speedup idea (two, even) |
Date | |
Msg-id | 16260.917226362@sss.pgh.pa.us |
Responses |
Re: [HACKERS] Another speedup idea (two, even)
Re: [HACKERS] Another speedup idea (two, even) |
List | pgsql-hackers |
printtup() does a SearchSysCache call for each attribute of each tuple in order to find the appropriate output routine for the attribute's type. (Up till yesterday it did *two* such calls per attribute, but I fixed that.) This is fairly expensive, amounting to about 10% of the runtime in the SELECT-a-large-table test case I'm looking at. It's probably even more than that in cases that don't stress heap_getattr as badly as this one does.

It occurs to me that there's no good reason to do this lookup more than once per column --- all the tuples in a relation should have the same set of column types, no? So if we could do these lookups once at the start of an output pass, and cache the results for use in individual printtup calls, we could drive that 10% down to zero at essentially no penalty.

There are a couple different ways this could be handled. The way that looks good to me at first glance is to extend the notion of a "tuple destination" (as selected by DestToFunction in dest.c) to include not just a per-tuple processing routine but also setup and cleanup routines, and some storage accessible to all three routines. The setup routine would be passed the TupleDesc info that is expected to apply to all tuples subsequently sent to that destination, and it can do nothing or do setup work for use by the per-tuple routine. What we'd actually have it do for the printtup destination type is create and fill in an array of per-column output function info. The cleanup routine is for symmetry --- for this immediate issue all it would need to do is free the data created by the setup routine, but I can imagine new kinds of destinations that need more setup/cleanup someday.

That gives us a place to precalculate the system cache search that finds the type-specific output routine's OID. But as long as we are precalculating stuff, it would also be worthwhile to precalculate the info that fmgr.c needs in order to invoke the routine.
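To make the setup/per-tuple/cleanup shape concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: `Dest`, `MiniTupleDesc`, `TypeId`, `lookup_output_fn`, and the toy output routines are all hypothetical stand-ins invented for illustration, and `lookup_count` only exists to show how often the simulated catalog lookup runs. The setup routine resolves one output function pointer per column, the per-tuple routine calls through the cached pointers, and the cleanup routine frees the cache.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef enum { TYPE_INT4, TYPE_TEXT } TypeId;
typedef int (*OutputFn)(const void *value, char *buf, size_t buflen);
typedef struct { int natts; const TypeId *atttypes; } MiniTupleDesc;

static int lookup_count = 0;    /* number of simulated catalog lookups */

static int int4_out(const void *v, char *buf, size_t n)
{ return snprintf(buf, n, "%d", *(const int *) v); }

static int text_out(const void *v, char *buf, size_t n)
{ return snprintf(buf, n, "%s", (const char *) v); }

/* Stand-in for the expensive per-type lookup the email wants to avoid. */
static OutputFn lookup_output_fn(TypeId t)
{
    lookup_count++;
    switch (t) {
        case TYPE_INT4: return int4_out;
        case TYPE_TEXT: return text_out;
    }
    return NULL;
}

/* A destination: three routines plus storage shared by all of them. */
typedef struct Dest Dest;
struct Dest {
    void (*setup)(Dest *self, const MiniTupleDesc *desc);
    void (*receive)(Dest *self, const void *const *values,
                    char *out, size_t outlen);
    void (*cleanup)(Dest *self);
    void *state;
};

typedef struct { int natts; OutputFn *outfns; } PrinttupState;

/* Setup: do one lookup per column and cache the function pointers. */
static void printtup_setup(Dest *self, const MiniTupleDesc *desc)
{
    PrinttupState *st = malloc(sizeof *st);
    if (st == NULL) abort();
    st->natts = desc->natts;
    st->outfns = malloc((size_t) desc->natts * sizeof *st->outfns);
    if (st->outfns == NULL) abort();
    for (int i = 0; i < desc->natts; i++) {
        st->outfns[i] = lookup_output_fn(desc->atttypes[i]);
        if (st->outfns[i] == NULL) abort();     /* unknown type */
    }
    self->state = st;
}

/* Per tuple: no lookups, just calls through the cached pointers. */
static void printtup_receive(Dest *self, const void *const *values,
                             char *out, size_t outlen)
{
    PrinttupState *st = self->state;
    size_t used = 0;

    assert(st != NULL);         /* setup must have run first */
    if (outlen == 0) return;
    out[0] = '\0';
    for (int i = 0; i < st->natts; i++) {
        if (i > 0 && used + 1 < outlen) { out[used++] = '|'; out[used] = '\0'; }
        int n = st->outfns[i](values[i], out + used, outlen - used);
        if (n < 0 || (size_t) n >= outlen - used) break;   /* truncated */
        used += (size_t) n;
    }
}

/* Cleanup: free what setup allocated. */
static void printtup_cleanup(Dest *self)
{
    PrinttupState *st = self->state;
    if (st != NULL) { free(st->outfns); free(st); }
    self->state = NULL;
}
```

In this sketch the number of lookups depends only on the column count, not on how many tuples are sent, which is the saving described above. A caller would fill a `Dest` with the three routines, call `setup` once with the tuple descriptor, call `receive` once per tuple, and call `cleanup` at the end of the output pass.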
For builtin functions it seems to me that we ought to be able to reduce the per-tuple call effort to a straight jump through a function pointer, which would save almost another 10% of SELECT's runtime. Even for non-builtins, finding out that it's not a builtin once per select instead of once per tuple would be helpful.

This last idea could perhaps be combined with the revision of the function manager interface that some folks have been muttering about for a while (ie, fix its deficiencies w.r.t. null parameter values). I think we're too close to 6.5 beta to start hacking on a function manager refit, but maybe the tuple destination improvement could be done in time for 6.5?

regards, tom lane