Re: Query 4-5 times slower after ANALYZE - Mailing list pgsql-general

From Philippe Lang
Subject Re: Query 4-5 times slower after ANALYZE
Date
Msg-id E6A0649F1FBFA3408A37F505400E7AC215CDAB@email.attiksystem.ch
In response to Query 4-5 times slower after ANALYZE  ("Philippe Lang" <philippe.lang@attiksystem.ch>)
Responses Re: Query 4-5 times slower after ANALYZE  (Bill Moran <wmoran@potentialtech.com>)
Re: Query 4-5 times slower after ANALYZE  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
pgsql-general-owner@postgresql.org wrote:
> In response to "Philippe Lang" <philippe.lang@attiksystem.ch>:
>>
>> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>>
>> After a fresh restore of a customer dump (running version 8.2.7 at
>> the moment), a rather big query executes in about 30 seconds. As
>> soon as I run ANALYZE, it is instantly 4-5 times slower. I could
>> check that multiples times.
>>
>> Here is the EXPLAIN ANALYZE before the ANALYZE:
>>
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt
>>
>> And here is the EXPLAIN ANALYZE after the ANALYZE:
>>
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt
>>
>> Any idea what could be turned on/off in order not to have this
>> slowdown after the ANALYZE?
>
> I opened one of those links figuring I'd take a few minutes to see if
> I could muster up some advice ... and just started laughing ...
> definitely not the type of query that one can even understand in just
> a few minutes!
>
> Anyway, the real reason I posted -- I doubt if anyone will be able to
> make sense of a query plan that complex without the actual query, so
> you'll probably want to post it as well.

:) What? I thought you would read that like Neo was reading the
Matrix... :)

Bill, Gregory, Tomas and Sebastian, thanks for your answers. I have
tried everything you mentioned:

- increase seq_page_cost
- increase geqo_threshold
- increase join_collapse_limit
- increase from_collapse_limit
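
For reference, all of these can be tried per session, without touching
postgresql.conf (the values below are only illustrative, not what was
actually used):

----------------
-- Planner/GEQO settings are plain GUCs; SET changes them for the
-- current session only, and RESET ALL reverts everything.
SET seq_page_cost = 4.0;        -- default 1.0; raising it discourages seq scans
SET geqo_threshold = 20;        -- default 12; GEQO kicks in later
SET join_collapse_limit = 20;   -- default 8
SET from_collapse_limit = 20;   -- default 8

EXPLAIN ANALYZE SELECT ...;     -- re-run the problem query here

RESET ALL;
----------------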

But it did not help (except disabling sequential scans completely), and
for a reason I think I now understand better: part of the query looks
like:

----------------
SELECT

c.id AS customer_id,
c.name AS customer_name,
d.id AS document_id,
d.number AS document_number,
d.vref AS document_vref,
dt.name AS type,
d.creation_date AS value_date

FROM documents AS d

LEFT JOIN payment_terms AS pt
ON d.payment_term_id = pt.id

INNER JOIN reminder_levels AS rl
ON d.reminder_level_id = rl.id

INNER JOIN document_types AS dt
ON d.document_type_id = dt.id

INNER JOIN projects AS p
ON d.project_id = p.id

INNER JOIN customers AS c
ON p.customer_id = c.id

WHERE d.reminder = 1

AND solde_po(CURRENT_DATE, c.id) > 0

AND d.creation_date <= CURRENT_DATE
----------------

The heavy part here is the "solde_po" call (at the end), which takes up
most of the CPU time. That's why scanning the customers table is so
expensive. I imagine a small change in the way this table is scanned can
have an enormous effect on the overall execution time, for example when
a sequential scan is preferred over an index scan. Does that sound
correct?
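
Since the function dominates the runtime, one thing worth trying is
telling the planner how expensive it is: 8.3 lets you attach a cost
estimate to a function. A sketch (the argument types of solde_po are
assumed here, and the cost value is only illustrative):

----------------
-- COST is expressed in units of cpu_operator_cost; the default for a
-- non-C function is 100. A much larger value tells the planner to
-- evaluate solde_po() as late, and as few times, as possible.
ALTER FUNCTION solde_po(date, integer) COST 10000;
----------------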

A small question here: solde_po is an SQL function (not PL/pgSQL). Is it
"inlined" into the parent query before the whole execution plan is
calculated, or are the two treated completely separately?
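
For context, the planner inlines a scalar SQL function only under
certain conditions, roughly: the body is a single simple SELECT, the
function is not SECURITY DEFINER, carries no SET clauses, and its
volatility/strictness don't conflict with the expression it expands to.
A minimal example of the inlinable shape (add_one is hypothetical):

----------------
-- A function like this is a candidate for inlining: the call site is
-- replaced by the expression "$1 + 1" before planning, so it costs no
-- more than writing the expression directly in the query.
CREATE FUNCTION add_one(integer) RETURNS integer AS
  'SELECT $1 + 1'
LANGUAGE sql STABLE;
----------------

Whether a given function was inlined shows up in EXPLAIN: the expanded
expression appears in the plan instead of a function call.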

Philippe

P.S. Thanks for the link to "explain.depesz.com"! Great tool!






