Inheritance and query plans - Mailing list pgsql-general
From | apb18@cornell.edu |
---|---|
Subject | Inheritance and query plans |
Date | |
Msg-id | Pine.SOL.3.91.1030922144054.24160D-100000@travelers.mail.cornell.edu Whole thread Raw |
List | pgsql-general |
A bit of query plan strangeness. Suppose you have an inheritance tree such that the columns 'ID' and 'field' appear in the top level table, call that table XXX. tables YYY and ZZZ both inherit XXX. Now suppose there exists some query that returns a set of IDs that match some criteria (that query may involve various tests/joins/etc on other arbitrary tables). Executing that query alone produces an optimal plan and the exact result set desired. If you want to display the contents of the 'field' column for each element of XXX, YYY, and ZZZ that contains an ID that is in this set, have found (with 7.3.4 at least) that the planner appears to be making a suboptimal plan when joining this set with an inherited table, whereas without inheritance it makes a good plan. Here are what the plans look like: GOOD Query: Select field from only XXX, (SELECT ID from [subquery that returns set of IDs]) as success where success.ID = XXX.ID ------ Nested loop -> Subquery scan [good plan that gets set of IDs that meet some criteria] ... -> Index scan on XXX BAD Query: Select field from XXX, (SELECT ID from [subquery query that returns set of IDs]) as success where success.ID = XXX.ID --------- Hash Join Hash Cond:("outer".id = "inner".id) -> Append -> Seq scan on XXX -> Seq scan on YYY -> Seq scan on ZZZ Hash [good plan that gets set of IDs that meet some criteria] ... In the first 'good' one, I essentially disable inheritance, and it produces a good, fast plan. For the query that uses inheritance, a terrible plan is produced. All tables XXX, YYY, and ZZZ are indexed properly, and either one when subsituted into the query alone produces a good plan (everything is vacuumed & analyzed and all ID fields are of the same type). Disabling hash joins, merge joins, sequential scans, etc will not ever turn a "bad" plan into a "good" one. Wouldn't a 'good' plan with inheritance append the results of index scans on XXX, YYY, and ZZZ instead of going to sequential scans (assume all tables are very big)? I tried this with 7.4 and basically the same thing happens (with 7.4 the 'good' plans can be reformulated to use 'IN' and they're still good, but the 'bad' plan is still bad). So.. Is this as much of a bad thing as I believe, or does it look like I made a mistake somewhere or could have done something differently? After a fair bit of head scratching, I'm pretty sure it's Postgres's fault, but I'd like a second opinion. In the case that it is.. I've seen quite a TODO list for inheritance, should possibly be added to the pile?* -Aaron *...one that I'd be interested in working on. I have some ideas that would possibly solve multiple problems, but have not proven to myself that they are good enough to be worth the trouble... mostly because I haven't defined 'trouble' or 'good' yet :)
pgsql-general by date: