Re: [HACKERS] Another speedup idea (two, even) - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Another speedup idea (two, even) |
Date | |
Msg-id | 199903151437.JAA12462@candle.pha.pa.us |
In response to | Another speedup idea (two, even) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Another speedup idea (two, even)
|
List | pgsql-hackers |
Tom, where are we on this?  As I remember, you did this already, right?

> printtup() does a SearchSysCache call for each attribute of each tuple
> in order to find the appropriate output routine for the attribute's
> type.  (Up till yesterday it did *two* such calls per attribute, but
> I fixed that.)  This is fairly expensive, amounting to about 10% of
> the runtime in the SELECT-a-large-table test case I'm looking at.
> It's probably even more than that in cases that don't stress
> heap_getattr as badly as this one does.
>
> It occurs to me that there's no good reason to do this lookup more
> than once per column --- all the tuples in a relation should have
> the same set of column types, no?  So if we could do these lookups
> once at the start of an output pass, and cache the results for use
> in individual printtup calls, we could drive that 10% down to zero
> at essentially no penalty.
>
> There are a couple different ways this could be handled.  The way
> that looks good to me at first glance is to extend the notion of
> a "tuple destination" (as selected by DestToFunction in dest.c)
> to include not just a per-tuple processing routine but also setup and
> cleanup routines, and some storage accessible to all three routines.
> The setup routine would be passed the TupleDesc info that is expected
> to apply to all tuples subsequently sent to that destination, and it can
> do nothing or do setup work for use by the per-tuple routine.  What
> we'd actually have it do for the printtup destination type is create
> and fill in an array of per-column output function info.  The cleanup
> routine is for symmetry --- for this immediate issue all it would need
> to do is free the data created by the setup routine, but I can imagine
> new kinds of destinations that need more setup/cleanup someday.
>
> That gives us a place to precalculate the system cache search that
> finds the type-specific output routine's OID.  But as long as we are
> precalculating stuff, it would also be worthwhile to precalculate the
> info that fmgr.c needs in order to invoke the routine.  For builtin
> functions it seems to me that we ought to be able to reduce the
> per-tuple call effort to a straight jump through a function pointer,
> which would save almost another 10% of SELECT's runtime.  Even for
> non-builtins, finding out that it's not a builtin once per select
> instead of once per tuple would be helpful.
>
> This last idea could perhaps be combined with the revision of the
> function manager interface that some folks have been muttering about
> for a while (ie, fix its deficiencies w.r.t. null parameter values).
>
> I think we're too close to 6.5 beta to start hacking on a function
> manager refit, but maybe the tuple destination improvement could be
> done in time for 6.5?
>
>			regards, tom lane
>

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
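For anyone reading the thread later, the quoted idea (a destination with setup, per-tuple, and cleanup routines, where setup caches one output function pointer per column) might be sketched roughly as below. This is plain illustrative C, not PostgreSQL code: `Dest`, `Value`, `TypeId`, `lookup_outfunc`, `make_print_dest` and the other names are hypothetical stand-ins invented for the sketch, and `lookup_count` only counts the simulated catalog lookups.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real types. */
typedef enum { TYPE_INT, TYPE_TEXT } TypeId;
typedef union { int i; const char *s; } Value;
typedef void (*OutFunc)(Value v, char *buf, size_t buflen);

int lookup_count = 0;           /* number of simulated catalog lookups */

void int_out(Value v, char *buf, size_t n)  { snprintf(buf, n, "%d", v.i); }
void text_out(Value v, char *buf, size_t n) { snprintf(buf, n, "%s", v.s); }

/* Stands in for the expensive per-type cache search. */
OutFunc lookup_outfunc(TypeId t)
{
    lookup_count++;
    return (t == TYPE_INT) ? int_out : text_out;
}

/* A destination: three routines plus storage shared by all three. */
typedef struct Dest Dest;
struct Dest {
    void (*setup)(Dest *self, const TypeId *coltypes, int ncols);
    void (*receive)(Dest *self, const Value *tuple);
    void (*cleanup)(Dest *self);
    int      ncols;
    OutFunc *outfuncs;          /* per-column cache filled in by setup */
    char     line[256];         /* most recently formatted row */
};

/* Setup: do the lookup once per column, not once per tuple. */
void print_setup(Dest *self, const TypeId *coltypes, int ncols)
{
    assert(ncols > 0);
    self->ncols = ncols;
    self->outfuncs = malloc((size_t) ncols * sizeof(OutFunc));
    assert(self->outfuncs != NULL);
    for (int i = 0; i < ncols; i++)
        self->outfuncs[i] = lookup_outfunc(coltypes[i]);
}

/* Per-tuple: a straight call through the cached function pointers. */
void print_receive(Dest *self, const Value *tuple)
{
    char field[64];

    self->line[0] = '\0';
    for (int i = 0; i < self->ncols; i++) {
        self->outfuncs[i](tuple[i], field, sizeof field);
        if (i > 0)
            strncat(self->line, "|",
                    sizeof self->line - strlen(self->line) - 1);
        strncat(self->line, field,
                sizeof self->line - strlen(self->line) - 1);
    }
}

/* Cleanup: free what setup allocated. */
void print_cleanup(Dest *self)
{
    free(self->outfuncs);
    self->outfuncs = NULL;
    self->ncols = 0;
}

Dest make_print_dest(void)
{
    Dest d = { print_setup, print_receive, print_cleanup, 0, NULL, "" };
    return d;
}
```

A caller would invoke `setup` once with the column types, `receive` once per row, and `cleanup` at the end, so the lookup cost depends on the number of columns rather than the number of rows. The real tuple destination and fmgr interfaces may well look different.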