"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> Sure, it's the same queries I posted earlier. My pgbench script is the
> following:
> BEGIN
> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'
> select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l,
> TL.motcleoverture_c, TL.baselinetheme from themelang TL where
> TL.codeth = 'ASS' and TL.codelang = 'FRA'
> SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite
> WHERE codevil = 'LYO'
> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'
> END

I poked into this a bit, and it seems the extra overhead is all coming
from resolving the ambiguous "=" operators. That didn't show up in my
test because my query had "int4_column = int4_const" which is an exact
match to a pg_operator entry. But since your columns are varchar,
which doesn't have any operators of its own, we have to go through
oper_select_candidate(), which is noticeably slower than before. The
slowdown seems to have two causes:

1. Datatype bloat: there are 58 "=" operators in pg_operator today,
   versus 54 at the beginning of the year. That's 7% more work right
   there to sort through the additional operators.

2. Removal of pg_cast entries associated with explicit varchar
   coercions: when there's not a pg_cast entry for the desired coercion,
   find_coercion_pathway does a second catalog lookup to see if it
   might be an array case. That happens more often in this test case
   than it did at the start of the year, because I got rid of pg_cast
   entries that could be replaced by the generic CoerceViaIO mechanism.

I'm not sure how big a hit #2 really is. Presumably the removal of the
redundant entries has some distributed savings associated with it, which
would partially counteract the extra lookup; but I don't have any tools
that can isolate the cost of those particular SearchSysCache calls out
of all the rest. In any case, #2 is specific to varchar and text while
effect #1 is an issue for just about everything.

The cost of resolving ambiguous operators has been an issue for a long
time, of course, but it seems particularly bad in this case --- gprof
blames 37% of the runtime on oper_select_candidate(). It might be time
to think about caching the results of operator searches somehow. Too
late for 8.3 though.
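
To make the caching idea concrete, here is a minimal sketch of what such a
cache could look like. It is an illustration only, not a proposed patch:
every name in it (cached_resolve, slow_resolve, OperCacheEntry, the table
size) is hypothetical, and slow_resolve is just a stand-in for the expensive
candidate search, returning a made-up OID. The point is that the lookup key
is the operator name plus the two input type OIDs as they appear in the
query, so a repeated "varchar = unknown" lookup pays for the search once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: none of these names are PostgreSQL APIs. */
typedef uint32_t Oid;

#define CACHE_SLOTS 256     /* power of two; assumed large enough */
#define OPNAME_MAX  64

typedef struct
{
    char name[OPNAME_MAX];  /* operator name, e.g. "=" */
    Oid  left;              /* left input type as written in the query */
    Oid  right;             /* right input type as written in the query */
    Oid  result;            /* operator the search settled on */
    int  used;              /* slot occupied? */
} OperCacheEntry;

static OperCacheEntry oper_cache[CACHE_SLOTS];
static int slow_lookups = 0;    /* counts trips through the slow path */

/* Stand-in for the expensive candidate search; returns a made-up OID. */
static Oid
slow_resolve(const char *name, Oid left, Oid right)
{
    slow_lookups++;
    return 1000 + (Oid) strlen(name) + left + right;
}

/* FNV-1a style hash over the operator name and both type OIDs. */
static uint32_t
hash_key(const char *name, Oid left, Oid right)
{
    uint32_t h = 2166136261u;
    const char *p;

    for (p = name; *p; p++)
    {
        h ^= (unsigned char) *p;
        h *= 16777619u;
    }
    h ^= left;
    h *= 16777619u;
    h ^= right;
    h *= 16777619u;
    return h;
}

/* Return the cached answer if present, else search once and remember it. */
static Oid
cached_resolve(const char *name, Oid left, Oid right)
{
    uint32_t slot;
    int      probes;

    assert(strlen(name) < OPNAME_MAX);
    slot = hash_key(name, left, right) & (CACHE_SLOTS - 1);

    /* Linear probing over the fixed-size table. */
    for (probes = 0; probes < CACHE_SLOTS; probes++)
    {
        OperCacheEntry *e = &oper_cache[slot];

        if (!e->used)
        {
            /* Miss: take the slow path once and store the result. */
            strcpy(e->name, name);
            e->left = left;
            e->right = right;
            e->result = slow_resolve(name, left, right);
            e->used = 1;
            return e->result;
        }
        if (e->left == left && e->right == right &&
            strcmp(e->name, name) == 0)
            return e->result;   /* hit */

        slot = (slot + 1) & (CACHE_SLOTS - 1);
    }

    /* Table full: fall back to the uncached search. */
    return slow_resolve(name, left, right);
}
```

A caller would use cached_resolve("=", lefttype, righttype) in place of the
direct search. What the sketch leaves out is the hard part: a real cache
would have to be invalidated when the relevant catalog entries change, and
would presumably have to account for anything else that affects which
operator is chosen, such as the search path.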
regards, tom lane