Re: 8.3devel slower than 8.2 under read-only load - Mailing list pgsql-hackers

From Tom Lane
Subject Re: 8.3devel slower than 8.2 under read-only load
Msg-id 23854.1196037323@sss.pgh.pa.us
In response to Re: 8.3devel slower than 8.2 under read-only load  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Responses Re: 8.3devel slower than 8.2 under read-only load  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Re: 8.3devel slower than 8.2 under read-only load  (Gregory Stark <stark@enterprisedb.com>)
Re: 8.3devel slower than 8.2 under read-only load  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> Sure, it's the same queries I posted earlier. My pgbench script is the
> following:
> BEGIN

> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'
> select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l,
> TL.motcleoverture_c, TL.baselinetheme from themelang TL where
> TL.codeth = 'ASS' and TL.codelang = 'FRA'
> SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite
> WHERE codevil = 'LYO'
> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'

> END

I poked into this a bit, and it seems the extra overhead is all coming
from resolving the ambiguous "=" operators.  That didn't show up in my
test because my query had "int4_column = int4_const" which is an exact
match to a pg_operator entry.  But since your columns are varchar,
which doesn't have any operators of its own, we have to go through
oper_select_candidate(), which is noticeably slower than before.  The
slowdown seems to have two causes:

1. Datatype bloat: there are 58 "=" operators in pg_operator today,
versus 54 at the beginning of the year.  That's 7% more work right
there to sort through the additional operators.

2. Removal of pg_cast entries associated with explicit varchar
coercions: when there's not a pg_cast entry for the desired coercion,
find_coercion_pathway does a second catalog lookup to see if it
might be an array case.  That happens more often in this test case
than it did at the start of the year, because I got rid of pg_cast
entries that could be replaced by the generic CoerceViaIO mechanism.
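As a rough illustration of the two effects listed above, here is a toy Python model. It is not PostgreSQL source: the catalog contents, the type names in PG_CAST and ARRAY_TYPES, and the function names are all invented for the sketch. It only shows why a missing pg_cast entry costs a second lookup, and where the 7% figure comes from.

```python
# Toy model only; not PostgreSQL source. Catalog contents are invented.
PG_CAST = {("int4", "int8"), ("int4", "float8")}  # hypothetical cast entries
ARRAY_TYPES = {"_int4", "_text"}                  # hypothetical array types


def find_pathway(source, target, stats):
    """Return a coercion method name, counting simulated catalog lookups."""
    stats["lookups"] += 1              # first lookup: is there a pg_cast entry?
    if (source, target) in PG_CAST:
        return "cast function"
    stats["lookups"] += 1              # second lookup: might it be an array case?
    if source in ARRAY_TYPES and target in ARRAY_TYPES:
        return "array coercion"
    return "CoerceViaIO"               # generic fallback, as named in the mail


def relative_growth(before, after):
    """Fractional increase in the number of candidates to sort through."""
    return (after - before) / before


if __name__ == "__main__":
    stats = {"lookups": 0}
    print(find_pathway("int4", "int8", stats), stats["lookups"])
    print(find_pathway("varchar", "int4", stats), stats["lookups"])
    print(f"{relative_growth(54, 58):.1%} more '=' operators")
```

In this model a coercion with a pg_cast entry costs one lookup, while one without it costs two, which is the extra work described in point 2. The growth from 54 to 58 operators works out to roughly 7%, matching point 1.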

I'm not sure how big a hit #2 really is.  Presumably the removal of the
redundant entries has some distributed savings associated with it, which
would partially counteract the extra lookup; but I don't have any tools
that can isolate the cost of those particular SearchSysCache calls out
of all the rest.  In any case, #2 is specific to varchar and text while
effect #1 is an issue for just about everything.

The cost of resolving ambiguous operators has been an issue for a long
time, of course, but it seems particularly bad in this case --- gprof
blames 37% of the runtime on oper_select_candidate().  It might be time
to think about caching the results of operator searches somehow.  Too
late for 8.3 though.
        regards, tom lane
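
The caching idea floated in the last paragraph could look something like the following Python sketch. This is only one possible shape, under assumptions: the OperatorCache class, the slow_resolve stand-in, and the invalidate-everything policy are hypothetical and are not taken from the mail or from PostgreSQL. The point is that the expensive candidate search runs once per distinct (operator name, left type, right type) combination.

```python
class OperatorCache:
    """Memoize operator resolution keyed by (name, left type, right type)."""

    def __init__(self, resolver):
        self._resolver = resolver   # the slow search, called only on a miss
        self._entries = {}
        self.misses = 0

    def lookup(self, opname, left_type, right_type):
        key = (opname, left_type, right_type)
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._resolver(opname, left_type, right_type)
        return self._entries[key]

    def invalidate(self):
        """Drop all entries, e.g. after operators or casts change."""
        self._entries.clear()


def slow_resolve(opname, left_type, right_type):
    # Hypothetical stand-in for the expensive ambiguous-operator search:
    # pretend that varchar inputs always resolve to the text operator.
    if "varchar" in (left_type, right_type):
        return (opname, "text", "text")
    return (opname, left_type, right_type)


if __name__ == "__main__":
    cache = OperatorCache(slow_resolve)
    for _ in range(1000):
        cache.lookup("=", "varchar", "unknown")
    print(cache.misses)
```

Any real cache would also need a rule for when entries go stale; the sketch simply drops everything on invalidate(), which is the crudest option.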

