Re: SQL statement over 500% slower with 9.2 compared with 9.1 - Mailing list pgsql-performance

From Jeff Janes
Subject Re: SQL statement over 500% slower with 9.2 compared with 9.1
Date
Msg-id CAMkU=1xQTqZvJed7cVzGo6soNv=94BAJo-Crof5NuGqYfABfCA@mail.gmail.com
Whole thread Raw
In response to SQL statement over 500% slower with 9.2 compared with 9.1  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
Responses Re: SQL statement over 500% slower with 9.2 compared with 9.1
List pgsql-performance
On Monday, August 26, 2013, Rafael Martinez wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello

We have a SQL statement that with 9.1 takes ca 4000ms to finnish and
with 9.2 over 22000ms.

The explain analyze information is here:

Could you do explain (analyze, buffers) of these?  
 

With 9.1.:
http://explain.depesz.com/s/5ou

With 9.2
http://explain.depesz.com/s/d4vU

The SQL statement is:

SELECT  firstname || ' ' || lastname AS Name
FROM    Person R
WHERE  R.gender like 'F'
AND  19 < (SELECT COUNT(DISTINCT filmId)
              FROM   FilmParticipation F
              WHERE  F.partType = 'director' AND
                     F.personId = R.personId    )

What happens if you excise the "19 < (select ...)" clause?

That would greatly simplify the analysis, assuming the problem remains.

How many distinct filmId are there?




We can see that the query plan is very different between versions and
that 9.2 is really wrong with the number of rows involved. Why is 9.2
taking so wrong about the number of rows involved in some parts of the
plan?

Most directors are not also actors, so there is a strong negative correlation that PostgreSQL is not aware of. However, I think if you could get 9.1 to report the same path, it would be just as wrong on that estimate.  But since it doesn't report the same path, you don't see how wrong it is.

Try running:

explain (analyze, buffers)
 SELECT  D.personId
                FROM    FilmParticipation D
                WHERE   D.partType = 'director'
                        --AND D.personId = R.personId
                        AND NOT EXISTS (
                                SELECT  *
                                FROM    FilmParticipation C
                                WHERE   C.partType = 'cast'
                                        AND C.filmId = D.filmId
                                        AND C.personId = D.personId
                                       );

On both 9.1 and 9.2.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Poor performance on simple queries compared to sql server express
Next
From: Rafael Martinez
Date:
Subject: Re: SQL statement over 500% slower with 9.2 compared with 9.1