Joins on inherited tables - Mailing list pgsql-performance

From apb18@cornell.edu
Subject Joins on inherited tables
Date
Msg-id Pine.SOL.3.91.1031001105940.15800E-100000@travelers.mail.cornell.edu
Whole thread Raw
Responses Re: Joins on inherited tables
List pgsql-performance
Hi,
    In some situations, it looks like the optimizer does not chose
efficient paths for joining inherited tables.  For I created a
rather trivial formulation to serve as an example.  I created the table
'numbers' comprising of the columns id (int) and value (text).  I also
created the table 'evens' and 'odds' that inherit numbers, with no
additional columns.  Into 'evens' I placed 50000 entries, each one with
an even (unique) id and random 'value'.  Likewise, for 'odds', I created
50000 odd (and unique) id fields id fields and random 'value', and
created index on all ID fields of every table that has any rows (and
analyzed).

so.. my tables look like this:

    Table "public.numbers"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 value  | text    |

     Table "public.evens"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 value  | text    |
Indexes:
    "ei" btree (id)
Inherits: numbers

     Table "public.odds"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 value  | text    |
Indexes:
    "oi" btree (id)
Inherits: numbers


As per the above construction, 'evens' and 'odds' both have 50000
rows. 'numbers' contains none.



Now, I created a trivial query that would use 'numbers' as an inheritor
table in a join (a very stupid one, but a join nevertheless) as
follows, which produces a terrible, but correct, plan:

select value from (SELECT 1::integer as id) as ids JOIN numbers on
(numbers.id = ids.id);
                                   QUERY PLAN
---------------------------------------------------------------------------
Hash Join  (cost=0.02..2195.79 rows=501 width=19)
   Hash Cond: ("outer".id = "inner".id)
   ->  Append  (cost=0.00..1690.50 rows=100051 width=23)
         ->  Seq Scan on numbers  (cost=0.00..0.00 rows=1 width=23)
         ->  Seq Scan on evens numbers (cost=0.00..845.25 rows=50025 width=23)
         ->  Seq Scan on odds numbers (cost=0.00..845.25 rows=50025 width=23)
   ->  Hash  (cost=0.02..0.02 rows=1 width=4)
         ->  Subquery Scan ids  (cost=0.00..0.02 rows=1 width=4)
               ->  Result  (cost=0.00..0.01 rows=1 width=0)




 Now, I subsitute 'evens' for 'numbers', so I am now joining with a normal,
non-inherited table.  The plan is much better:

select value from (SELECT 1::integer as id) as ids JOIN evens on
(evens.id = ids.id);

                              QUERY PLAN
-----------------------------------------------------------------------
 Nested Loop  (cost=0.00..3.05 rows=2 width=19)
   ->  Subquery Scan ids  (cost=0.00..0.02 rows=1 width=4)
         ->  Result  (cost=0.00..0.01 rows=1 width=0)
   ->  Index Scan using ei on evens  (cost=0.00..3.01 rows=1 width=23)
         Index Cond: (evens.id = "outer".id)


I would think that the ideal plan for the first query should be a nested
loop like the second that considers indexes where possible, and it would
look as follows:

            HYPOTHETICAL PLAN
------------------------------------------------------------------------
 Nested Loop
   ->  Subquery Scan ids
         ->  Result
   ->  Append
      ->  Seq Scan on numbers
      ->  Index Scan using ei on evens
            Index Cond: (evens.id = "outer.id")
      ->  Index Scan using oi on odds
            Index Cond: (odds.id = "outer.id")

So.. Why wouldn't postgres use such a plan?  I could think of three
reasons:
    - The planner wasn't considering this plan due to some
      fault of its own
    - That plan makes no sense and would not be able to be run in the
      executor, and therefore it would be wasteful to consider it.
    - It truly is more expensive than a hash join

I've pretty much ruled out the third and I suspect the second is also
untrue (though have not proven that to myself), leaving the first.  If it
is indeed the second, that the plan makes no sense, someone please let me
know!

OK, so I took a look into the optimizer and over time got a better
understanding of what's going on, though I still don't understand it
completely.  Here's my theory as to what's happening:

For this query, most of the path consideration takes place in
match_unsorted_outer() in path/joinpath.c.  For the 'good' query against
the non-inherited 'evens' table, the good plan is generated in the line:

    bestinnerjoin = best_inner_indexscan(...);

Since an inherited table doesn't have one single index over all its
inherited tables, this call produces a null bestinnerjoin.

Later on in match_unsorted_inner(), various access paths are considered
for a nested loop.  One is bestinnerjoin (when it exists), and that is
how the 'good' query gets its nested loop with an index scan.

Other paths considered for inclusion in the nested loop are
'inner_cheapest_total' and 'inner_cheapest_startup';  These plans,
presumably, contain sequential scans, which are expensive enough in the
nested loop that the nested loop plan is suboptimal compared to a hash
join, which is what happens in the 'bad' query that joins against the
inheritor table.

Now, it seems to me as if the solution would be one of the following:

    - create a new bestinnerjoin plan when best_inner_indexscan
      returned null, yet the current relation is an inheritor.
          This bestinnerjoin plan will use the optimal Append plan
          that comprises of the optimal plans for the inherited
          tables that are relevant for the joins (as in the
      case of the hypothetical query above where the append consists
          of a number of index and sequential scans)

    - Consider the optimal Append path relevant to the join
          later on when considering paths for use in a nested loop.
      i.e. inner_cheapest_total or inner_cheapest_startup
      will have to be this good append plan.

For 2, the problem seems to be that in creating the inner_cheapest_total,
joininfo nodes for each child relation do not exist.  I experimented
by just copying the parents joininfo nodes to each child, and it looked
like it was considering index scans somewhere along the line, but the
final plan chosen did not use index scans.  I didn't see where it
failed.   I'm kinda skeptical of 3 anyway, because the
inner_cheapest_total in the 'good' query with out inheritance does not
appear to involve index scans at all.

So.. does anybody have any advice?  Am I totally off base with
my assertions that a better plan can exist when joining inherited
tables?  Does my analysis of the problem and possible solutions in the
optimizer make any sense?  I basically pored over the source code, read
the documentation, and did little tests here and there to arrive at my
conclusions, but cannot claim to have a good global view of how
everything works.

    Thanks very much

        -Aaron


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Tuning/performance issue...
Next
From: Oleg Lebedev
Date:
Subject: Re: Tuning/performance issue...