ORDER BY user defined function performance issues - Mailing list pgsql-performance

From Nick Trainor
Subject ORDER BY user defined function performance issues
Date
Msg-id 20040611110837.E5F5215D888@ns.trainorthornton.co.uk
Whole thread Raw
Responses Re: ORDER BY user defined function performance issues
Re: ORDER BY user defined function performance issues
List pgsql-performance
I have an application which logs interactions on a regular basis.  The interaction details (their types, etc) are held
inone table (tblitem) and the 'hits' are held in tbltotal. 

I have written a function to get the total 'hits' during a period and need to collect together the information from
tblitemwith it. 

The query works OK returning results in under a second:

EXPLAIN ANALYSE SELECT
t1.value1,t1.value2,getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0')FROM tblitem
t1WHERE  t1.type_id=23::int2  and (t1.id >= 1::int8 and t1.id<=9223372036854775807::int8) 
OFFSET 0 LIMIT 20;
tracker-#                                                                QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.70 rows=20 width=56) (actual time=19.50..846.89 rows=20 loops=1)
   ->  Index Scan using tblitemtype_id on tblitem t1  (cost=0.00..230.10 rows=598 width=56) (actual time=19.49..846.81
rows=21loops=1) 
         Index Cond: (type_id = 23::smallint)
         Filter: ((id >= 1::bigint) AND (id <= 9223372036854775807::bigint))
 Total runtime: 847.04 msec

----
I realised that Postgresql did not like passing t1.id to the function without some form of constraints - hence the
(t1.id>= 1::int8 and t1.id<=9223372036854775807::int8) dummy constraints. 
----


However, when I seek to ORDER the results, then it takes 'forever':

EXPLAIN ANALYSE SELECT t1.value1,t1.value2,
getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0') FROM tblitem t1 WHERE
t1.type_id=23::int2 and (t1.id >= 1::int8 and t1.id<=9223372036854775807::int8) 
ORDER BY getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0') DESC
OFFSET 0 LIMIT 20;


tracker-# tracker-#
                                                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=257.66..257.71 rows=20 width=56) (actual time=25930.90..25930.95 rows=20 loops=1)
   ->  Sort  (cost=257.66..259.15 rows=598 width=56) (actual time=25930.90..25930.91 rows=21 loops=1)
         Sort Key: getday_total(1::smallint, 23::smallint, (id)::integer, 31::smallint, 59::smallint, 2::smallint,
2004::smallint,182::smallint, 153::smallint, 6::smallint, 2004::smallint, 0) 
         ->  Index Scan using tblitemtype_id on tblitem t1  (cost=0.00..230.10 rows=598 width=56) (actual
time=19.60..25927.68rows=693 loops=1) 
               Index Cond: (type_id = 23::smallint)
               Filter: ((id >= 1::bigint) AND (id <= 9223372036854775807::bigint))
 Total runtime: 25931.15 msec


And this is a database of only a few thousand rows, we are anticipating that this database is going to get huge.

What am I missing here?  How can I get it to order by the total of interactions without hitting the performance
problem?

Any help would be much appreciated.

Nick

nick A-T trainorthornton d-o-t co d-o-t uk



Version:
PostgreSQL 7.3.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)

pgsql-performance by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: *very* inefficient choice made by the planner (regarding
Next
From: Paul Thomas
Date:
Subject: Re: ORDER BY user defined function performance issues