Re: Joins on inherited tables - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Joins on inherited tables |
Date | |
Msg-id | 864.1065028461@sss.pgh.pa.us Whole thread Raw |
In response to | Joins on inherited tables (apb18@cornell.edu) |
Responses |
Re: Joins on inherited tables
|
List | pgsql-performance |
apb18@cornell.edu writes: > So.. does anybody have any advice? Look at set_inherited_rel_pathlist() in allpaths.c --- it forms the best plan for fully scanning the inheritance-tree table. Currently that's the *only* plan considered, and it does not make any use of join clauses. It's possible that something could be done with providing a similar routine for best inner indexscans taken across the whole inheritance tree. You'd have to figure out how to locate the applicable joinclauses though. I think they'd be attached to the inheritance-tree parent relation and not to the individual inheritance tree member relations. Also, if you are wondering why best_inner_indexscan() is so tense about caching its results, that's because it gets called *a lot* in large join problems. If you don't achieve a similar level of efficiency then you'll be seeing some nasty performance problems in larger queries. I think you'd find there is executor work to do as well; I'm not sure how the outer-relation values would get propagated down into the indexscans when there's an Append node between. Maybe the existing code would Just Work, but I bet not. <digression> Not sure if this will help you, but: Once upon a time the planner did the APPEND for an inheritance tree at the top of the plan not the bottom. (It still does when the tree is the target of an update/delete query.) In 7.0 for example I get a plan like this: create table pt (f1 int primary key); create table ct1 (f2 int) inherits (pt); create table ct2 (f2 int) inherits (pt); create index ct1i on ct1(f1); create table bar(f1 int); explain select * from pt*, bar where pt.f1 = bar.f1; NOTICE: QUERY PLAN: Append (cost=69.83..474.33 rows=30000 width=8) -> Merge Join (cost=69.83..154.83 rows=10000 width=8) -> Index Scan using pt_pkey on pt (cost=0.00..60.00 rows=1000 width=4) -> Sort (cost=69.83..69.83 rows=1000 width=4) -> Seq Scan on bar (cost=0.00..20.00 rows=1000 width=4) -> Merge Join (cost=69.83..154.83 rows=10000 width=8) -> Index Scan using ct1i on ct1 pt (cost=0.00..60.00 rows=1000 width=4) -> Sort (cost=69.83..69.83 rows=1000 width=4) -> Seq Scan on bar (cost=0.00..20.00 rows=1000 width=4) -> Merge Join (cost=139.66..164.66 rows=10000 width=8) -> Sort (cost=69.83..69.83 rows=1000 width=4) -> Seq Scan on bar (cost=0.00..20.00 rows=1000 width=4) -> Sort (cost=69.83..69.83 rows=1000 width=4) -> Seq Scan on ct2 pt (cost=0.00..20.00 rows=1000 width=4) whereas the same test in CVS tip produces QUERY PLAN ---------------------------------------------------------------------------- Merge Join (cost=303.09..353.09 rows=3000 width=8) Merge Cond: ("outer".f1 = "inner".f1) -> Sort (cost=69.83..72.33 rows=1000 width=4) Sort Key: bar.f1 -> Seq Scan on bar (cost=0.00..20.00 rows=1000 width=4) -> Sort (cost=233.26..240.76 rows=3000 width=4) Sort Key: public.pt.f1 -> Append (cost=0.00..60.00 rows=3000 width=4) -> Seq Scan on pt (cost=0.00..20.00 rows=1000 width=4) -> Seq Scan on ct1 pt (cost=0.00..20.00 rows=1000 width=4) -> Seq Scan on ct2 pt (cost=0.00..20.00 rows=1000 width=4) The fact that 7.0 could actually adapt to different index sets for different child tables was kinda cool, but the append-at-the-top strategy failed completely for outer joins, so we had to abandon it. In practice I think the generated plan was usually worse anyway (note that bar gets scanned three times in 7.0's plan), but for the specific case where the inheritance tree is on the inside of a nestloop that could be indexed, the new approach is not as good. If you can come up with a fix that doesn't break things in other respects, it'd be great. [digs in CVS logs] The patch that altered the APPEND-at-the-top behavior was this one: 2000-11-11 19:36 tgl * src/: backend/commands/command.c, backend/commands/copy.c, backend/commands/explain.c, backend/executor/execMain.c, backend/executor/execQual.c, backend/executor/execTuples.c, backend/executor/execUtils.c, backend/executor/functions.c, backend/executor/nodeAppend.c, backend/executor/nodeSeqscan.c, backend/nodes/copyfuncs.c, backend/nodes/equalfuncs.c, backend/nodes/outfuncs.c, backend/nodes/readfuncs.c, backend/optimizer/path/allpaths.c, backend/optimizer/path/pathkeys.c, backend/optimizer/plan/createplan.c, backend/optimizer/plan/planmain.c, backend/optimizer/plan/planner.c, backend/optimizer/prep/prepunion.c, backend/optimizer/util/pathnode.c, backend/optimizer/util/relnode.c, backend/parser/parse_clause.c, backend/tcop/pquery.c, include/catalog/catversion.h, include/executor/executor.h, include/executor/tuptable.h, include/nodes/execnodes.h, include/nodes/nodes.h, include/nodes/parsenodes.h, include/nodes/plannodes.h, include/nodes/relation.h, include/optimizer/pathnode.h, include/optimizer/planmain.h, include/optimizer/planner.h, include/optimizer/prep.h, backend/optimizer/README: Restructure handling of inheritance queries so that they work with outer joins, and clean things up a good deal at the same time. Append plan node no longer hacks on rangetable at runtime --- instead, all child tables are given their own RT entries during planning. Concept of multiple target tables pushed up into execMain, replacing bug-prone implementation within nodeAppend. Planner now supports generating Append plans for inheritance sets either at the top of the plan (the old way) or at the bottom. Expanding at the bottom is appropriate for tables used as sources, since they may appear inside an outer join; but we must still expand at the top when the target of an UPDATE or DELETE is an inheritance set, because we actually need a different targetlist and junkfilter for each target table in that case. Fortunately a target table can't be inside an outer join... Bizarre mutual recursion between union_planner and prepunion.c is gone --- in fact, union_planner doesn't really have much to do with union queries anymore, so I renamed it grouping_planner. Not sure if studying that diff would teach you anything useful, but there it is... </digression> regards, tom lane
pgsql-performance by date: