Re: Slow query: table iteration (8.3) - Mailing list pgsql-performance

From Yeb Havinga
Subject Re: Slow query: table iteration (8.3)
Date
Msg-id 4B6BFE33.5080809@gmail.com
Whole thread Raw
In response to Re: Slow query: table iteration (8.3)  (Glenn Maynard <glenn@zewt.org>)
Responses Re: Slow query: table iteration (8.3)
List pgsql-performance
Glenn Maynard wrote:
> The function version:
>
> CREATE FUNCTION high_score_for_user(user_id int) RETURNS SETOF INTEGER
> LANGUAGE SQL AS $$
>        SELECT t.id FROM test t
>        WHERE t.user_id = $1
>        ORDER BY t.score DESC LIMIT 1
> $$;
> SELECT high_score_for_user(u.id) FROM test_users u;
>
> runs in 100ms.
>
Hi Glenn,

About cached plans of SQL functions: from the source of function.c

00067 /*
00068  * An SQLFunctionCache record is built during the first call,
00069  * and linked to from the fn_extra field of the FmgrInfo struct.
00070  *
00071  * Note that currently this has only the lifespan of the calling
query.
00072  * Someday we might want to consider caching the parse/plan
results longer
00073  * than that.
00074  */

So it is planned at every call of

SELECT high_score_for_user(u.id) FROM test_users u;

and the cache is used between each row of test_users. The plan is with a
parameter, that means the optimizer could not make use of an actual
value during planning. However, your test case is clever in the sense
that there is an index on users and score and the sql function has an
order by that matches the index, so the planner can avoid a sort by
accessing the test table using the index. In this particular case, that
means that the plan is optimal; no unneeded tuples are processed and the
(function) plan complexity is logaritmic on the size of the test
relation, you can't get it any better than that. In short: the lack of
an actual parameter in the test case did not result in an inferior plan.
So using a dynamic constructed query string in pl/pgsql to 'force'
replanning during iteration cannot be faster than this sql function.

It is possible to make the performance if this function worse by
disabling indexscans:

CREATE FUNCTION high_score_for_user(user_id int) RETURNS SETOF INTEGER
LANGUAGE SQL AS $$
       SELECT t.id FROM test t
       WHERE t.user_id = $1
       ORDER BY t.score DESC LIMIT 1
$$
SET enable_indexscan = off;

Now the query time with test_users is over a second. So maybe the
converse could also be true in your production setup using the same
technique.

regards,
Yeb Havinga









pgsql-performance by date:

Previous
From: "Albe Laurenz"
Date:
Subject: Re: foreign key constraint lock behavour in postgresql
Next
From: "Wojtek"
Date:
Subject: index on partitioned table