Re: Join optimization for inheritance tables - Mailing list pgsql-hackers
From | Herodotos Herodotou |
---|---|
Subject | Re: Join optimization for inheritance tables |
Date | |
Msg-id | 43826FCDC252204EA7823B2E7CF3CCEC25FC2F4B@Pandora.AsterData.local Whole thread Raw |
In response to | Re: Join optimization for inheritance tables (Greg Stark <gsstark@mit.edu>) |
Responses |
Re: Join optimization for inheritance tables
Re: Join optimization for inheritance tables |
List | pgsql-hackers |
This patch extends the query optimizer to consider joins between child tables when hierarchies are joined together. Short description: when the optimizer considers join paths between two tables with child tables, it only creates join pathsover two append paths. In this patch we extend the optimizer to consider joins between the child tables from the twoparent relations, based on the child table constraints. The idea is that only child tables with overlapping constraintsneed to be joined, and not the entire hierarchies with each other. This optimization can provide significant benefitsin terms of execution time. A short motivation example as well as the overall algorithm description could be foundin the attached presentation. Herodotos & Nedyalko ________________________________________ From: gsstark@gmail.com [gsstark@gmail.com] On Behalf Of Greg Stark [gsstark@mit.edu] Sent: Monday, July 06, 2009 3:14 PM To: Nedyalko Borisov Cc: Tom Lane; pgsql-hackers@postgresql.org; Herodotos Herodotou; Eric Friedman; John Cieslewicz; Dheeraj Pandey Subject: Re: Join optimization for inheritance tables On Mon, Jul 6, 2009 at 10:23 PM, Nedyalko Borisov<nedyalko@asterdata.com> wrote: > For example, typical observed scenario is: optimizer chooses Merge Join for > two partitioned tables like the plan below: > Merge Cond: (table1.id = table2.id) > -> Sort > Sort Key: id > -> Append > -> Seq Scan on table1_part1 > -> Seq Scan on table1_part2 > -> .... > -> Seq Scan on table1_partN > -> Sort > Sort Key: id > -> Append > -> Seq Scan on table2_part1 > -> Seq Scan on table2_part2 > -> .... > -> Seq Scan on table2_partM > > This plan ignores the fact there are indexes on the table2 partitions and > that the pairwise partitions joins (index nested loop or hash join) will be > faster than scanning all the partitions and sorting them. To some degree my merge-append patch would mitigate this case. It would allow the use of indexes on some or all the partitions to avoid the sorts. However it would still force all the partitions to be appended on each side and then merged. If we could match up all the partitions then I think this plan would be faster with the Append on top and separate merge joins for each pair of partitions. Aside from skipping the cpu cost of the merge-append I think it would win for a few other reasons as well. Each join would be able to come up with much better statistics which would enable it to pick a better join when one is available. Even if the planner still picks a merge join it would be much more likely to finish early and skip the remainder of a partition on one side or the other. > OK, implementing a special "abstract"/"parent" table would make more sense. I had in mind to tackle this in a bit of a roundabout way. If we mark the parent table "read-only" then notice that all tuples (all 0 of them) in the table are frozen then we can discard that table from the plans. Since changing the "read-only" attribute would have to be treated as a DDL operation which would invalidate any cached plans we can trust that it won't change as long as the plan lives so no new tuples can be inserted. The reason I wanted to take such a roundabout route instead of having an "abstract" or "empty" property is that a wanted to generalize this. Once we know a table is read-only then there are lots of properties we could find useful in planning aside from emptiness. We could have statistics like the minimum and maximum value for a column which the planner would be able to trust and exclude partitions without having to explicitly declare constraints on every column. This is all just my musings, not any kind of consensus. Does it make sense to others or is it too baroque when a simple "abstract" flag would do? -- greg http://mit.edu/~gsstark/resume.pdf
Attachment
pgsql-hackers by date: