Re: can I move sort to first outer join ? - Mailing list pgsql-performance

From PFC
Subject Re: can I move sort to first outer join ?
Date
Msg-id op.ua41y7z7cigqcu@apollo13.peufeu.com
Whole thread Raw
In response to can I move sort to first outer join ?  (fernando castano <Fernando.Castano@Sun.COM>)
List pgsql-performance
On Wed, 14 May 2008 06:40:40 +0200, fernando castano
<Fernando.Castano@Sun.COM> wrote:

>
> Hi all,
>
> This sql is taking too long for the size of my tiny db.  Any tips from
> this alias?  I tried moving the sort to the first left outer join
> (between projects and features tables) using a nested subquery, but
> postgres tells me only one column could be returned from a subqueyr.

    Instead of :

    SELECT * FROM a LEFT JOIN b LEFT JOIN c WHERE c.column=... ORDER BY c.x
LIMIT N

    You could write :

    SELECT * FROM a LEFT JOIN b LEFT JOIN (SELECT * FROM c WHERE c.column=...
ORDER BY c.x LIMIT N) AS cc ORDER BY cc.x LIMIT N

    This is only interesting of you use a LIMIT and this allows you to reduce
the number of rows sorted/joined.

    However in your case this is not the right thing to do since you do not
use LIMIT, and sorting your 846 rows will only take a very small time.
Your problem are those seq scans, you need to optimize that query so it
can use indexes.

>                      ->  Seq Scan on projects  (cost=0.00..10.90 rows=4
> width=1884) (actual time=0.039..0.109 rows=10 loops=1)
>                            Filter: (visible AND (id = ANY
> ('{3,4,5,6,10,7,8,9,13,11}'::integer[])))
>                      ->  Hash  (cost=10.40..10.40 rows=40 width=1870)
> (actual time=1.048..1.048 rows=101 loops=1)
>                            ->  Seq Scan on features  (cost=0.00..10.40
> rows=40 width=1870) (actual time=0.026..0.464 rows=101 loops=1)
>                ->  Hash  (cost=10.70..10.70 rows=70 width=1065) (actual
> time=0.098..0.098 rows=29 loops=1)
>                      ->  Seq Scan on person_roles  (cost=0.00..10.70
> rows=70 width=1065) (actual time=0.014..0.037 rows=29 loops=1)
>          ->  Hash  (cost=15.80..15.80 rows=580 width=106) (actual
> time=0.105..0.105 rows=32 loops=1)
>                ->  Seq Scan on project_tags  (cost=0.00..15.80 rows=580
> width=106) (actual time=0.013..0.036 rows=32 loops=1)
>  Total runtime: 149.622 ms

    All those seq scans !!!

    Please post, for each of those tables :

    - The total number of rows (SELECT count(*) is fine)
    - The table definitions with indexes (\d table)

    EXPLAIN ANALYZE tells you the number of rows it picked out of a seq scan
(that's the "rows=") but not the number of rows scanned... this is
important, because a seq scan on a small table isn't a problem, but on a
big one, it is.

pgsql-performance by date:

Previous
From: Rusty Conover
Date:
Subject: Re: Regexps - never completing join.
Next
From: Valentin Bogdanov
Date:
Subject: postgres overall performance seems to degrade when large SELECT are requested