Re: Joins on inherited tables - Mailing list pgsql-performance

From Tom Lane
Subject Re: Joins on inherited tables
Date
Msg-id 864.1065028461@sss.pgh.pa.us
Whole thread Raw
In response to Joins on inherited tables  (apb18@cornell.edu)
Responses Re: Joins on inherited tables
List pgsql-performance
apb18@cornell.edu writes:
> So.. does anybody have any advice?

Look at set_inherited_rel_pathlist() in allpaths.c --- it forms the best
plan for fully scanning the inheritance-tree table.  Currently that's
the *only* plan considered, and it does not make any use of join
clauses.  It's possible that something could be done with providing a
similar routine for best inner indexscans taken across the whole
inheritance tree.  You'd have to figure out how to locate the applicable
joinclauses though.  I think they'd be attached to the inheritance-tree
parent relation and not to the individual inheritance tree member
relations.  Also, if you are wondering why best_inner_indexscan() is so
tense about caching its results, that's because it gets called *a lot*
in large join problems.  If you don't achieve a similar level of
efficiency then you'll be seeing some nasty performance problems in
larger queries.

I think you'd find there is executor work to do as well; I'm not sure
how the outer-relation values would get propagated down into the
indexscans when there's an Append node between.  Maybe the existing code
would Just Work, but I bet not.

<digression>

Not sure if this will help you, but:

Once upon a time the planner did the APPEND for an inheritance tree at
the top of the plan not the bottom.  (It still does when the tree is the
target of an update/delete query.)  In 7.0 for example I get a plan like
this:

create table pt (f1 int primary key);
create table ct1 (f2 int) inherits (pt);
create table ct2 (f2 int) inherits (pt);
create index ct1i on ct1(f1);
create table bar(f1 int);

explain select * from pt*, bar where pt.f1 = bar.f1;
NOTICE:  QUERY PLAN:

Append  (cost=69.83..474.33 rows=30000 width=8)
    ->  Merge Join  (cost=69.83..154.83 rows=10000 width=8)
          ->  Index Scan using pt_pkey on pt  (cost=0.00..60.00 rows=1000 width=4)
          ->  Sort  (cost=69.83..69.83 rows=1000 width=4)
                ->  Seq Scan on bar  (cost=0.00..20.00 rows=1000 width=4)
    ->  Merge Join  (cost=69.83..154.83 rows=10000 width=8)
          ->  Index Scan using ct1i on ct1 pt  (cost=0.00..60.00 rows=1000 width=4)
          ->  Sort  (cost=69.83..69.83 rows=1000 width=4)
                ->  Seq Scan on bar  (cost=0.00..20.00 rows=1000 width=4)
    ->  Merge Join  (cost=139.66..164.66 rows=10000 width=8)
          ->  Sort  (cost=69.83..69.83 rows=1000 width=4)
                ->  Seq Scan on bar  (cost=0.00..20.00 rows=1000 width=4)
          ->  Sort  (cost=69.83..69.83 rows=1000 width=4)
                ->  Seq Scan on ct2 pt  (cost=0.00..20.00 rows=1000 width=4)

whereas the same test in CVS tip produces

                                 QUERY PLAN
----------------------------------------------------------------------------
 Merge Join  (cost=303.09..353.09 rows=3000 width=8)
   Merge Cond: ("outer".f1 = "inner".f1)
   ->  Sort  (cost=69.83..72.33 rows=1000 width=4)
         Sort Key: bar.f1
         ->  Seq Scan on bar  (cost=0.00..20.00 rows=1000 width=4)
   ->  Sort  (cost=233.26..240.76 rows=3000 width=4)
         Sort Key: public.pt.f1
         ->  Append  (cost=0.00..60.00 rows=3000 width=4)
               ->  Seq Scan on pt  (cost=0.00..20.00 rows=1000 width=4)
               ->  Seq Scan on ct1 pt  (cost=0.00..20.00 rows=1000 width=4)
               ->  Seq Scan on ct2 pt  (cost=0.00..20.00 rows=1000 width=4)

The fact that 7.0 could actually adapt to different index sets for
different child tables was kinda cool, but the append-at-the-top
strategy failed completely for outer joins, so we had to abandon it.
In practice I think the generated plan was usually worse anyway (note
that bar gets scanned three times in 7.0's plan), but for the specific
case where the inheritance tree is on the inside of a nestloop that
could be indexed, the new approach is not as good.  If you can come up
with a fix that doesn't break things in other respects, it'd be great.

[digs in CVS logs]  The patch that altered the APPEND-at-the-top
behavior was this one:

2000-11-11 19:36  tgl

    * src/: backend/commands/command.c, backend/commands/copy.c,
    backend/commands/explain.c, backend/executor/execMain.c,
    backend/executor/execQual.c, backend/executor/execTuples.c,
    backend/executor/execUtils.c, backend/executor/functions.c,
    backend/executor/nodeAppend.c, backend/executor/nodeSeqscan.c,
    backend/nodes/copyfuncs.c, backend/nodes/equalfuncs.c,
    backend/nodes/outfuncs.c, backend/nodes/readfuncs.c,
    backend/optimizer/path/allpaths.c,
    backend/optimizer/path/pathkeys.c,
    backend/optimizer/plan/createplan.c,
    backend/optimizer/plan/planmain.c,
    backend/optimizer/plan/planner.c,
    backend/optimizer/prep/prepunion.c,
    backend/optimizer/util/pathnode.c,
    backend/optimizer/util/relnode.c, backend/parser/parse_clause.c,
    backend/tcop/pquery.c, include/catalog/catversion.h,
    include/executor/executor.h, include/executor/tuptable.h,
    include/nodes/execnodes.h, include/nodes/nodes.h,
    include/nodes/parsenodes.h, include/nodes/plannodes.h,
    include/nodes/relation.h, include/optimizer/pathnode.h,
    include/optimizer/planmain.h, include/optimizer/planner.h,
    include/optimizer/prep.h, backend/optimizer/README: Restructure
    handling of inheritance queries so that they work with outer joins,
    and clean things up a good deal at the same time.  Append plan node
    no longer hacks on rangetable at runtime --- instead, all child
    tables are given their own RT entries during planning.    Concept of
    multiple target tables pushed up into execMain, replacing bug-prone
    implementation within nodeAppend.  Planner now supports generating
    Append plans for inheritance sets either at the top of the plan
    (the old way) or at the bottom.  Expanding at the bottom is
    appropriate for tables used as sources, since they may appear
    inside an outer join; but we must still expand at the top when the
    target of an UPDATE or DELETE is an inheritance set, because we
    actually need a different targetlist and junkfilter for each target
    table in that case.  Fortunately a target table can't be inside an
    outer join...  Bizarre mutual recursion between union_planner and
    prepunion.c is gone --- in fact, union_planner doesn't really have
    much to do with union queries anymore, so I renamed it
    grouping_planner.

Not sure if studying that diff would teach you anything useful, but
there it is...

</digression>

            regards, tom lane

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: TPC-R benchmarks
Next
From: Christopher Browne
Date:
Subject: Re: inferior SCSI performance