Steve Crawford sent me some profiling results for queries involving wide
tuples (hundreds of columns).
> Done, results attached. nocachegetattr seems to be the likely suspect.
Yipes, you can say that again.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
93.38 26.81 26.81 885688 0.03 0.03 nocachegetattr
0.00 0.00 1/885688 heapgettup [159]
0.00 0.00 1/885688 CatalogCacheComputeTupleHashValue [248]
0.00 0.00 5/885688 SearchCatCache [22]
13.40 0.00 442840/885688 ExecEvalVar [20]
13.40 0.00 442841/885688 printtup [12]
[11] 93.4 26.81 0.00 885688 nocachegetattr [11]
Half of the calls are coming from printtup(), which seems relatively easy
to fix.
/*
* send the attributes of this tuple
*/
for (i = 0; i < natts; ++i)
{
...
origattr = heap_getattr(tuple, i + 1, typeinfo, &isnull);
...
}
The trouble here is that in the presence of variable-width fields,
heap_getattr requires a linear scan over the tuple --- and so the total
time spent in it is O(N^2) in the number of fields.
What we could do is reinstitute heap_deformtuple() as the inverse of
heap_formtuple() --- but make it extract Datums for all the columns in
a single pass over the tuple. This would reduce the time in printtup()
from O(N^2) to O(N), which would pretty much wipe out that part of the
problem.
The other half of the calls are coming from ExecEvalVar, which is a
harder problem to solve, since those calls are scattered all over the
place. It's harder to see how to get them to share work. Any ideas
out there?
regards, tom lane