Thread: index/join madness
Ok, I've built the most insane query ever. It joins 11 tables, most of which are the same table, just extracting different values. Here is the query plan: Nested Loop (cost=0.00..5011.89 rows=1 width=156) -> Nested Loop (cost=0.00..4191.82 rows=1 width=140) -> NestedLoop (cost=0.00..4189.79 rows=1 width=112) -> Nested Loop (cost=0.00..4188.58 rows=1 width=104) -> Nested Loop (cost=0.00..4186.55 rows=1 width=88) -> Nested Loop (cost=0.00..3366.48 rows=1 width=72) -> Nested Loop (cost=0.00..2546.41 rows=1 width=68) -> Nested Loop (cost=0.00..1726.34 rows=1 width=52) -> Nested Loop (cost=0.00..906.27 rows=1 width=32) -> Seq Scan on formdata f6 (cost=0.00..904.16 rows=1 width=4) -> Index Scan using users_pkey on users u (cost=0.00..2.02 rows=1 width=28) SubPlan -> Seq Scan on sentletters (cost=0.00..0.00 rows=1 width=4) -> Seq Scan on formdata f5 (cost=0.00..818.42 rows=131 width=20) -> Seq Scan on formdata f2 (cost=0.00..818.42 rows=131 width=16) -> Seq Scan on formdata f1 (cost=0.00..818.42 rows=131 width=4) -> Seq Scan on formdata f3 (cost=0.00..818.42 rows=131 width=16) -> Index Scan using formmvl_pkey on formmvl m (cost=0.00..2.02 rows=1 width=16) -> Seq Scan on relations r (cost=0.00..1.12 rows=7 width=8) -> Index Scan using users_pkey on users u2 (cost=0.00..2.02 rows=1 width=28) -> Seq Scan on formdata f4 (cost=0.00..818.42 rows=131 width=16) If anyone has a screen wide enough to see this, you will see that the majority of the time is spent doing sequential scans on the formdata table. This table needs formid, fieldid and userid to find the value I'm looking for. It has one index defined on: Index "formdata_pkey"Attribute | Type -----------+---------formid | integeroccid | integeruserid | integerfieldid | integer unique btree (primary key) In my case I'm ignoring occid since it's always 1 for these values. Is there any way I can coerce this into using a multifield index? My joins generally look like: JOIN formdata AS f2 ON (u.id=f2.userid AND f2.formid=1 AND f2.fieldid=2) I'm a little curious as to why it's not using the primary key... Finally, I'm planning on moving this to 7.2 and converting all the joins to use outer joins. Will there be a significant penalty in performance running outer joins? -Michael _________________________________________________________________ http://fastmail.ca/ - Fast Free Web Email for Canadians
Michael Richards <michael@fastmail.ca> wrote: > Ok, I've built the most insane query ever. Indeed. Kindly stay away from sharp objects and wait for the friendly people to bring you to the bouncy room :-) > If anyone has a screen wide enough to see this, you will see that the > majority of the time is spent doing sequential scans on the formdata > table. This table needs formid, fieldid and userid to find the value I'm > looking for. > In my case I'm ignoring occid since it's always 1 for these values. What happens if you put in an "and occid = 1"? Perhaps it's the case that you know it's always one, but the query planner doesn't realise this. > I'm a little curious as to why it's not using the primary key... Another possible cause which I've seen in a couple of recent threads is that the query analyzer isn't very smart about the relationships between types, so it may help to explicitly cast values to the exact type used in the index. HTH, Ray -- "Never trust a poll you haven't rigged yourself."
On Wed, 23 May 2001, Michael Richards wrote: > Finally, I'm planning on moving this to 7.2 and converting all the > joins to use outer joins. Will there be a significant penalty in > performance running outer joins? Why are you planning on using outer joins? Yes there is a performance penalty because postgres will have to emit more tuples. Are you sure that you need to use outer joins? -- Dave
"Michael Richards" <michael@fastmail.ca> writes: > [ a severely incomplete problem description ] Table schema? Full text of the query? > It has one index defined on: > Index "formdata_pkey" > Attribute | Type > -----------+--------- > formid | integer > occid | integer > userid | integer > fieldid | integer > unique btree (primary key) > In my case I'm ignoring occid since it's always 1 for these values. > Is there any way I can coerce this into using a multifield index? It won't use *that* multifield index, at least not as a multifield index, if you provide no constraint on occid. Per the documentation: : The query optimizer can use a multi-column index for queries that : involve the first n consecutive columns in the index (when used with : appropriate operators), up to the total number of columns specified in : the index definition. For example, an index on (a, b, c) can be used in : queries involving all of a, b, and c, or in queries involving both a and : b, or in queries involving only a, but not in other combinations. (In a : query involving a and c the optimizer might choose to use the index for : a only and treat c like an ordinary unindexed column.) > Finally, I'm planning on moving this to 7.2 and converting all the > joins to use outer joins. Will there be a significant penalty in > performance running outer joins? Compared to what? Outer joins are surely a lot faster than most of the possible substitutes, but you didn't tell us what you're doing instead. regards, tom lane