I have a massive join that takes something like 10 seconds to execute in
Postgresql 7.2. Can any of gurus here help me improve it?
It is as follows:
SELECT DISTINCT cs.ContentID AS cscid, ct.Name AS ctname, pb.Name AS
pbname, cs.Author AS author, cs.CreationDate AS fromtime, cs.ExpireDate
AS totime, ct.Name AS media
FROM
ContentSummaries AS cs JOIN ContentTypes AS ct ON (ct.ContentTypeID =
cs.ContentTypeID)
JOIN ContentAttributes AS cab ON (cab.ContentID = cs.ContentID)
JOIN Attributes AS ab ON (ab.AttributeID = cab.AttributeID)
JOIN Categories AS cat ON (cat.CategoryID = cs.CategoryID)
LEFT OUTER JOIN Publishers AS pb ON (pb.PublisherID = cs.PublisherID)
WHERE cs.CreationDate IS NOT NULL
I already created indexes on all possible fields in those tables.
the query plan:
Limit (cost=3170.75..3173.00 rows=15 width=172) -> Unique (cost=3170.75..3203.77 rows=220 width=172) -> Sort
(cost=3170.75..3170.75rows=2201 width=172) -> Nested Loop (cost=611.64..3048.54 rows=2201
width=172) -> Hash Join (cost=611.64..3021.02 rows=2201
width=148) -> Hash Join (cost=610.61..2931.93 rows=2201
width=136) -> Hash Join (cost=609.44..2842.53
rows=2201 width=124) -> Seq Scan on contentattributes
cab (cost=0.00..867.05 rows=33005 width=24) -> Hash (cost=600.27..600.27
rows=3667 width=100) -> Hash Join
(cost=1.02..600.27 rows=3667 width=100) -> Seq Scan on
contentsummaries cs (cost=0.00..452.52 rows=3667 width=76) -> Hash
(cost=1.02..1.02 rows=2 width=24) -> Seq Scan on
contenttypes ct (cost=0.00..1.02 rows=2 width=24) -> Hash (cost=1.14..1.14 rows=14
width=12) -> Seq Scan on attributes ab
(cost=0.00..1.14 rows=14 width=12) -> Hash (cost=1.02..1.02 rows=2 width=12)
-> Seq Scan on categories cat
(cost=0.00..1.02 rows=2 width=12) -> Seq Scan on publishers pb (cost=0.00..0.00
rows=1 width=24)
Thanks in advance. I had been really scratching my head for this one.
--
Wei Weng
Network Software Engineer
KenCast Inc.