Re: CLUSTER and a problem - Mailing list pgsql-performance

From Tom Lane
Subject Re: CLUSTER and a problem
Date
Msg-id 21876.1252970013@sss.pgh.pa.us
Whole thread Raw
In response to Re: CLUSTER and a problem  (Andrzej Zawadzki <zawadaa@wp.pl>)
Responses Re: CLUSTER and a problem  (Andrzej Zawadzki <zawadaa@wp.pl>)
List pgsql-performance
Andrzej Zawadzki <zawadaa@wp.pl> writes:
> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
> WHERE TRUE
> AND kredytyag.id = 3064776
> AND NOT EXISTS
> (SELECT 1 FROM
> ( SELECT * FROM kredyty kr
> where telekredytid = 328650
> ORDER BY kr.datazaw DESC LIMIT 1 )
> kred where kred.bank = 2);

So this is the slow bit:

>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
> time=2045556.496..2045556.496 rows=0 loops=1)
>            Filter: (kred.bank = 2)
>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
> time=2045556.492..2045556.492 rows=0 loops=1)
>                  ->  Index Scan Backward using kredyty_datazaw on
> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
> time=2045556.487..2045556.487 rows=0 loops=1)
>                        Filter: (telekredytid = 328650)

It's doing a scan in descending datazaw order and hoping to find a row
that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
one, so the indexscan runs clear to the end before it can report that the
NOT EXISTS is true.  Unfortunately, you've more or less forced this
inefficient query plan by wrapping some of the search conditions inside a
LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
Or, if you do this type of query a lot, a special-purpose index might be
worthwhile.  It would probably be fast as-is if you had an index on
(telekredytid, datazaw) (in that order).

            regards, tom lane

pgsql-performance by date:

Previous
From: Andrzej Zawadzki
Date:
Subject: Re: CLUSTER and a problem
Next
From: Michael Glaesemann
Date:
Subject: Re: How to post Performance Questions