Neil Conway <neilc@samurai.com> writes:
> I've been taking a look at improving the performance of the GEQO
> code. Before looking at algorithmic improvements, I noticed some
> "low-hanging fruit": gprof revealed that newNode() was a hotspot. When
> I altered it to be an inline function, the overall time to plan a
> 12-table join (using the default GEQO settings) dropped by about
> 9%.

How much did you bloat the code? There are an awful lot of calls to
newNode(), so even though it's not all that large, I'd think the
multiplier would be nasty.

It might be better to offer two versions, standard out-of-line and a
macro-or-inline called something like newNodeInline(), and change just
the hottest call sites to use the inline. You could probably get most
of the speedup from inlining at just a few call sites. (I've been
intending to do something similar with the TransactionId comparison
functions.)
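
To make the two-version idea concrete, here is a minimal sketch. It assumes a C99 compiler that accepts "static inline", which is not something this thread can take for granted, and the Node, NodeTag and T_DemoNode definitions are hypothetical stand-ins rather than the real node declarations. calloc() stands in for whatever allocator the real newNode() uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real node types are not shown in this thread. */
typedef enum NodeTag { T_Invalid = 0, T_DemoNode } NodeTag;
typedef struct Node { NodeTag type; } Node;

/*
 * Inline flavor, meant only for the few hottest call sites.
 * Assumption: the compiler supports C99 "static inline".
 */
static inline Node *
newNodeInline(size_t size, NodeTag tag)
{
    Node *result = calloc(1, size);   /* zero-filled allocation */

    if (result == NULL)
        abort();                      /* placeholder error handling */
    result->type = tag;
    return result;
}

/*
 * Standard out-of-line flavor: a single copy of the code that every
 * ordinary call site shares, so those sites add no code size.
 */
Node *
newNode(size_t size, NodeTag tag)
{
    return newNodeInline(size, tag);
}
```

Only the call sites that show up in the profile would be switched to newNodeInline(); everything else keeps calling newNode(), so the code-size cost is paid only where the speedup is.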
> However, I'm not sure if I used the correct syntax for inlining the
> function (since it was originally declared in a header and defined
> elsewhere, I couldn't just add 'inline'). The method I used (declaring
> the function 'extern inline' and defining it in the header file) works
> for me with GCC 3.2, but I'm open to suggestions for improvement.

This isn't portable at all, AFAIK :-(. Unfortunately I can't think
of a portable way to do it with a macro, either.

However, if you're willing to go the route involving changing call
sites, then you could change

	foo = newNode(typename);

to

	newNodeMacro(foo, typename);

whereupon it becomes pretty easy to make a suitable macro.
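
A rough sketch of what such a statement-style macro might look like follows. Because the target variable is passed in, the macro can assign to it directly and needs neither a temporary nor a result value, which is what makes it expressible in plain C. The DemoNode type, the T_ tag naming, and the use of malloc() are assumptions made for illustration, not the actual definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the real node declarations. */
typedef enum NodeTag { T_Invalid = 0, T_DemoNode } NodeTag;
typedef struct Node { NodeTag type; } Node;
typedef struct DemoNode { NodeTag type; int value; } DemoNode;

/*
 * Statement-style macro: allocates a zeroed node of the given type,
 * stores it in "result", and sets its tag.  The do { } while (0)
 * wrapper makes it behave like a single statement, e.g. inside an
 * unbraced if/else.  Assumption: each node type Foo has a tag T_Foo.
 */
#define newNodeMacro(result, nodetype) \
    do { \
        (result) = (nodetype *) malloc(sizeof(nodetype)); \
        if ((result) == NULL) \
            abort();            /* placeholder error handling */ \
        memset((result), 0, sizeof(nodetype)); \
        ((Node *) (result))->type = T_##nodetype; \
    } while (0)
```

A hot call site would then read "newNodeMacro(foo, DemoNode);" instead of "foo = newNode(DemoNode);". Note that the result argument is evaluated more than once, so it should be a plain variable without side effects.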
regards, tom lane