Re: Slow queries consisting inner selects and order bys & hack to speed up - Mailing list pgsql-performance

From Tom Lane
Subject Re: Slow queries consisting inner selects and order bys & hack to speed up
Date
Msg-id 7599.1138043954@sss.pgh.pa.us
Whole thread Raw
In response to Slow queries consisting inner selects and order bys & hack to speed up  (Ümit Öztosun <umit@likyabilisim.com>)
List pgsql-performance
=?ISO-8859-1?Q?=DCmit_=D6ztosun?= <umit@likyabilisim.com> writes:
> Our application uses typical queries similar to following (very simplified):

> SELECT
>  part_id,
>  part_name,
>  (SELECT
>   SUM(amount) FROM part_movements M
>   WHERE P.part_id =3D M.part_id
>  ) as part_amount
> FROM parts P
> ORDER BY part_name
> LIMIT 50

> Postgres seems to compute all possible rows and then sorts the
> results, which nearly renders the paging meaningless.

Yeah.  The general rule is that sorting happens after computing the
SELECT values --- this is more or less required for cases where the
ORDER BY refers to a SELECT-list item.  You'd probably have better
results by writing a sub-select:

SELECT
 part_id,
 part_name,
 (SELECT
  SUM(amount) FROM part_movements M
  WHERE P.part_id = M.part_id
 ) as part_amount
FROM
 (SELECT part_id, part_name FROM parts P
  WHERE whatever ORDER BY whatever LIMIT n) as P;

This will do the part_movements stuff only for rows that make it out of
the sub-select.

Another approach is to make sure the ORDER BY is always on an indexed
column; in cases where the ORDER BY is done by an indexscan instead
of a sort, calculation of the unwanted SELECT-list items does not
happen.  However, this only works as long as LIMIT+OFFSET is fairly
small.

Lastly, are you on a reasonably current Postgres version (performance
complaints about anything older than 8.0 will no longer be accepted
with much grace), and are your statistics up to date?  The ANALYZE
shows rowcount estimates that seem suspiciously far off:

        ->  Seq Scan on scf_stokkart stok  (cost=0.00..142622.54 rows=25 width=503) (actual time=4.726..19324.633
rows=4947loops=1) 
              Filter: (upper((durum)::text) = 'A'::text)

This is important because, if the planner realized that the SELECT-list
items were going to be evaluated 5000 times not 25, it might well choose
a different plan.

            regards, tom lane

pgsql-performance by date:

Previous
From: Frank Wiles
Date:
Subject: Re: [PERFORMANCE] Stored Procedures
Next
From: "Adnan HOTMAIL"
Date:
Subject: unsubscribe