The following bug has been logged online:
Bug reference: 2730
Logged by: Baudracco Pierre
Email address: pierre.baudracco@aliasource.fr
PostgreSQL version: 8.1.5
Operating system: Linux Debian unstable (SID)
Description: strange query performance !
Details:
Debian : ii postgresql-8.1 8.1.5-1
on a simple data model (a contact, linked to a company and categories
categorizing a contact) this query takes more than 30 seconds !
SELECT distinct contact_id
FROM Contact
LEFT JOIN Company ON contact_company_id=company_id
LEFT JOIN CategoryLink AS cl ON
contact_id=cl.categorylink_entity_id
AND cl.categorylink_entity='contact'
AND cl.categorylink_category='contactcategory2'
WHERE cl.categorylink_category_id='268' and contact_archive=0;
If I invert (only change) the LEFT JOIN clause the queries returns
immediately
SELECT distinct contact_id
FROM Contact
LEFT JOIN CategoryLink AS cl ON
contact_id=cl.categorylink_entity_id
AND cl.categorylink_entity='contact'
AND cl.categorylink_category='contactcategory2'
LEFT JOIN Company ON contact_company_id=company_id WHERE
cl.categorylink_category_id='268' and contact_archive=0;
if I supress one of the where clause, it returns immediately too... very
strange
here are the queries plan (explain):
first and slow one :
************************
obm=> explain SELECT distinct contact_id FROM Contact LEFT JOIN Company ON
contact_company_id=company_id LEFT JOIN CategoryLink AS cl ON
contact_id=cl.categorylink_entity_id AND cl.categorylink_entity='contact'
AND cl.categorylink_category='contactcategory2' WHERE
cl.categorylink_category_id='268' and contact_archive=0;
QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-------------------------
Unique (cost=0.00..1423.38 rows=1 width=4)
-> Nested Loop (cost=0.00..1423.37 rows=1 width=4)
Join Filter: ("inner".contact_id = "outer".categorylink_entity_id)
-> Index Scan using categorylink_pkey on categorylink cl
(cost=0.00..4.87 rows=1 width=4)
Index Cond: ((categorylink_category_id = 268) AND
((categorylink_category)::text = 'contactcategory2'::text) AND
((categorylink_entity)::text = 'contact'::text))
-> Nested Loop Left Join (cost=0.00..1417.30 rows=96 width=4)
-> Seq Scan on contact (cost=0.00..840.51 rows=96 width=8)
Filter: ((contact_archive)::text = '0'::text)
-> Index Scan using company_pkey on company
(cost=0.00..6.00 rows=1 width=4)
Index Cond: ("outer".contact_company_id =
company.company_id)
I can see a strange seq scan on contact but why ???
*****************************************************
second one, really fast
obm=> explain SELECT distinct contact_id FROM Contact LEFT JOIN
CategoryLink AS cl ON contact_id=cl.categorylink_entity_id AND
cl.categorylink_entity='contact' AND
cl.categorylink_category='contactcategory2' LEFT JOIN Company on
contact_company_id=company_id WHERE cl.categorylink_category_id='268' and
contact_archive=0;
QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-------------------------------
Unique (cost=0.00..16.91 rows=1 width=4)
-> Nested Loop Left Join (cost=0.00..16.90 rows=1 width=4)
-> Nested Loop (cost=0.00..10.90 rows=1 width=8)
-> Index Scan using categorylink_pkey on categorylink cl
(cost=0.00..4.87 rows=1 width=4)
Index Cond: ((categorylink_category_id = 268) AND
((categorylink_category)::text = 'contactcategory2'::text) AND
((categorylink_entity)::text = 'contact'::text))
-> Index Scan using contact_pkey on contact
(cost=0.00..6.01 rows=1 width=8)
Index Cond: (contact.contact_id =
"outer".categorylink_entity_id)
Filter: ((contact_archive)::text = '0'::text)
-> Index Scan using company_pkey on company (cost=0.00..6.00
rows=1 width=4)
Index Cond: ("outer".contact_company_id =
company.company_id)
Is it a bug or may I have missed something ??