Thread: Query runs in 335ms; function in 100,239ms : date problem?

Query runs in 335ms; function in 100,239ms : date problem?

From
Rory Campbell-Lange
Date:
I have a function wrapping a (fairly complex) query.

The query itself runs in about a 1/3rd of a second. When running the
query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
over 100 seconds, about 300 times slower.

The function takes 3 input parameters: 2 dates and a boolean. The dates
(in_date_from, in_date_to) are used several times in the function.

When I replace the two parameters in the body of the query with, for
instance date'2011-05-01' and date'2011-08-01', the function operates
almost as speedily as the straight query.

I would be grateful to know how to work around this date problem.

As the function is rather large I have taken the liberty of posting it
here:
http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html

Rory


Re: Query runs in 335ms; function in 100,239ms : date problem?

From
"Tomas Vondra"
Date:
On 5 Září 2011, 23:07, Rory Campbell-Lange wrote:
> I have a function wrapping a (fairly complex) query.
>
> The query itself runs in about a 1/3rd of a second. When running the
> query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
> over 100 seconds, about 300 times slower.
>
> The function takes 3 input parameters: 2 dates and a boolean. The dates
> (in_date_from, in_date_to) are used several times in the function.
>
> When I replace the two parameters in the body of the query with, for
> instance date'2011-05-01' and date'2011-08-01', the function operates
> almost as speedily as the straight query.
>
> I would be grateful to know how to work around this date problem.
>
> As the function is rather large I have taken the liberty of posting it
> here:
> http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html

Do I understand correctly that you compare a query with literal parameters
with a parametrized query wrapped in a plpgsql function?

Try to run it as a prepared query - I guess you'll get about the same run
time as with the function (i.e. over 100 seconds).

The problem with prepared statements is that when planning the query, the
parameter values are unknown - so the optimizer does not know selectivity
of the conditions etc. and uses "common" values to prepare a safe plan.
OTOH the literal parameters allow to optimize the plan according to the
actual parameter values.

Tomas


Re: Query runs in 335ms; function in 100,239ms : date problem?

From
Rory Campbell-Lange
Date:
On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:
> On 5 Zá??í 2011, 23:07, Rory Campbell-Lange wrote:
...
> > The query itself runs in about a 1/3rd of a second. When running the
> > query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
> > over 100 seconds, about 300 times slower.
...
> > As the function is rather large I have taken the liberty of posting it
> > here:
> > http://campbell-lange.net/media/files/fn_report_pers_leave.sql.html
>
> Do I understand correctly that you compare a query with literal parameters
> with a parametrized query wrapped in a plpgsql function?

Yes! Certainly I need to make the function perform more quickly.

> Try to run it as a prepared query - I guess you'll get about the same run
> time as with the function (i.e. over 100 seconds).

The prepared query runs in almost exactly the same time as the function,
but thanks for the suggestion. A very useful aspect of it is that I was
able to get the EXPLAIN output which I guess gives a fairly good picture
of the plan used for the function.

The explain output is here:
http://campbell-lange.net/media/files/explain.txt.html

I'm inexperienced in reading EXPLAIN output, but it looks like the
Nested Loop Semi Join at line 72 is running very slowly.

> The problem with prepared statements is that when planning the query, the
> parameter values are unknown - so the optimizer does not know selectivity
> of the conditions etc. and uses "common" values to prepare a safe plan.
> OTOH the literal parameters allow to optimize the plan according to the
> actual parameter values.

Thank you very much for the helpful explanation.

Regards
Rory

Re: Query runs in 335ms; function in 100,239ms : date problem?

From
Rory Campbell-Lange
Date:
On 05/09/11, Rory Campbell-Lange (rory@campbell-lange.net) wrote:
> On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:
> > On 5 Zá??í 2011, 23:07, Rory Campbell-Lange wrote:
> ...
> > > The query itself runs in about a 1/3rd of a second. When running the
> > > query as a 'RETURN QUERY' function on Postgres 8.4, the function runs in
> > > over 100 seconds, about 300 times slower.

...

> > Try to run it as a prepared query - I guess you'll get about the same run
> > time as with the function (i.e. over 100 seconds).
>
> The prepared query runs in almost exactly the same time as the function,
> but thanks for the suggestion. A very useful aspect of it is that I was
> able to get the EXPLAIN output which I guess gives a fairly good picture
> of the plan used for the function.
>
> The explain output is here:
> http://campbell-lange.net/media/files/explain.txt.html
>
> I'm inexperienced in reading EXPLAIN output, but it looks like the
> Nested Loop Semi Join at line 72 is running very slowly.

I added in more filtering conditions to the clause at line 72 and the
prepared statement dropped in runtime to 24043.902 ms. Unfortunately the
function ran slower -- 47957.796 -- but even that is a 50% improvement.

Thanks very much for your help.

Regards
Rory

Re: Query runs in 335ms; function in 100,239ms : date problem?

From
"Tomas Vondra"
Date:
On 6 Září 2011, 0:27, Rory Campbell-Lange wrote:
> On 05/09/11, Tomas Vondra (tv@fuzzy.cz) wrote:
>> Do I understand correctly that you compare a query with literal
>> parameters
>> with a parametrized query wrapped in a plpgsql function?
>
> Yes! Certainly I need to make the function perform more quickly.
>
>> Try to run it as a prepared query - I guess you'll get about the same
>> run
>> time as with the function (i.e. over 100 seconds).
>
> The prepared query runs in almost exactly the same time as the function,
> but thanks for the suggestion. A very useful aspect of it is that I was
> able to get the EXPLAIN output which I guess gives a fairly good picture
> of the plan used for the function.

Well, my point was that the queries wrapped in functions are executed just
like prepared statements. And because prepared queries do not use
parameter values to optimize the plan, the result may be worse compared to
queries with literal parameters. So I was not expecting an improvement, I
was merely trying to show the problem.

> The explain output is here:
> http://campbell-lange.net/media/files/explain.txt.html
>
> I'm inexperienced in reading EXPLAIN output, but it looks like the
> Nested Loop Semi Join at line 72 is running very slowly.

I've posted the plan here: http://explain.depesz.com/s/uYX

Yes, the nested loop is the problem. One of the problems is that the join
condition - can you rewrite this

AND r.d_date || '-' || r.n_session || '-' || u.n_id IN
    (SELECT
        d_date || '-' || n_session || '-' || n_person
    FROM
        leave_association
    WHERE
        d_date >= in_date_from
        AND d_date <= in_date_to
    ) -- i.e. leave where the person normally works

like this

AND EXISTS
    (SELECT
        1
    FROM
        leave_association
    WHERE
        d_date >= in_date_from
        AND d_date <= in_date_to
        AND d_date = r.d_date
        AND n_session = r.n_session
        AND n_person = u.n_id
    ) -- i.e. leave where the person normally works

and then do the same for the NOT IN subquery (=> NOT EXISTS). I think it
should return the same results, plus it might use indexes on the
leave_association. That was not possible because of the join condition.

Tomas


Re: Query runs in 335ms; function in 100,239ms : date problem?

From
Torsten Zühlsdorff
Date:
Rory Campbell-Lange schrieb:

>>> Try to run it as a prepared query - I guess you'll get about the same run
>>> time as with the function (i.e. over 100 seconds).
>> The prepared query runs in almost exactly the same time as the function,
>> but thanks for the suggestion. A very useful aspect of it is that I was
>> able to get the EXPLAIN output which I guess gives a fairly good picture
>> of the plan used for the function.
>>
>> The explain output is here:
>> http://campbell-lange.net/media/files/explain.txt.html
>>
>> I'm inexperienced in reading EXPLAIN output, but it looks like the
>> Nested Loop Semi Join at line 72 is running very slowly.
>
> I added in more filtering conditions to the clause at line 72 and the
> prepared statement dropped in runtime to 24043.902 ms. Unfortunately the
> function ran slower -- 47957.796 -- but even that is a 50% improvement.

Also declare your function as STABLE. This can trigger an speed-increase.

Greetings,
Torsten
--
http://www.dddbl.de - ein Datenbank-Layer, der die Arbeit mit 8
verschiedenen Datenbanksystemen abstrahiert,
Queries von Applikationen trennt und automatisch die Query-Ergebnisse
auswerten kann.

Re: Query runs in 335ms; function in 100,239ms : date problem?

From
Tom Lane
Date:
"Tomas Vondra" <tv@fuzzy.cz> writes:
> On 6 Září 2011, 0:27, Rory Campbell-Lange wrote:
>> The prepared query runs in almost exactly the same time as the function,
>> but thanks for the suggestion. A very useful aspect of it is that I was
>> able to get the EXPLAIN output which I guess gives a fairly good picture
>> of the plan used for the function.

> Well, my point was that the queries wrapped in functions are executed just
> like prepared statements. And because prepared queries do not use
> parameter values to optimize the plan, the result may be worse compared to
> queries with literal parameters. So I was not expecting an improvement, I
> was merely trying to show the problem.

Right.  The actual fix is to use EXECUTE so you force a new plan to be
generated each time.  If you use EXECUTE USING to insert the parameter
values, you can avoid most of the notational mess this would otherwise
imply, as well as the risk of SQL-injection bugs from failing to quote
parameter values safely.

I'm hoping that 9.2 will be smart enough to not need such workarounds,
but that's where things stand at the moment.

            regards, tom lane