On Monday, August 26, 2013, Rafael Martinez wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello
We have a SQL statement that with 9.1 takes ca 4000ms to finnish and
with 9.2 over 22000ms.
The explain analyze information is here:
Could you do explain (analyze, buffers) of these?
With 9.1.:
http://explain.depesz.com/s/5ou
With 9.2
http://explain.depesz.com/s/d4vU
The SQL statement is:
SELECT firstname || ' ' || lastname AS Name
FROM Person R
WHERE R.gender like 'F'
AND 19 < (SELECT COUNT(DISTINCT filmId)
FROM FilmParticipation F
WHERE F.partType = 'director' AND
F.personId = R.personId )
What happens if you excise the "19 < (select ...)" clause?
That would greatly simplify the analysis, assuming the problem remains.
How many distinct filmId are there?
We can see that the query plan is very different between versions and
that 9.2 is really wrong with the number of rows involved. Why is 9.2
taking so wrong about the number of rows involved in some parts of the
plan?
Most directors are not also actors, so there is a strong negative correlation that PostgreSQL is not aware of. However, I think if you could get 9.1 to report the same path, it would be just as wrong on that estimate. But since it doesn't report the same path, you don't see how wrong it is.
Try running:
explain (analyze, buffers)
SELECT D.personId
FROM FilmParticipation D
WHERE D.partType = 'director'
--AND D.personId = R.personId
AND NOT EXISTS (
SELECT *
FROM FilmParticipation C
WHERE C.partType = 'cast'
AND C.filmId = D.filmId
AND C.personId = D.personId
);
On both 9.1 and 9.2.
Cheers,
Jeff