Re: UNION ALL on partitioned tables won't use indices. - Mailing list pgsql-hackers

From Robert Haas
Subject Re: UNION ALL on partitioned tables won't use indices.
Date
Msg-id CA+Tgmob1PVQK16JbJOjeHZhqq0vsBpwkNSkD+Up+X+m9YByCuQ@mail.gmail.com
Whole thread Raw
In response to UNION ALL on partitioned tables won't use indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: UNION ALL on partitioned tables won't use indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Thu, Oct 24, 2013 at 6:39 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, Sorry that it's been a while..
>
> 1. Observed symptom
>
> As you know, UNION ALL accompanied with ORDER BY uses indexes if
> available.
>
>> uniontest=# EXPLAIN SELECT * FROM c11 UNION ALL SELECT * FROM c12 ORDER BY a;
>>                                QUERY PLAN
>> ---------------------------------------------------------------------------
>>  Merge Append  (cost=0.59..10214.10 rows=200000 width=16)
>>    Sort Key: c11.a
>>    ->  Index Scan using ... on c11  (cost=0.29..3857.04 rows=100000 width=16)
>>    ->  Index Scan using ... on c12  (cost=0.29..3857.04 rows=100000 width=16)
>
> So do simple queries on partitioned (inheritance) tables.
>
>> uniontest=# EXPLAIN SELECT * FROM p1 ORDER BY a;
>>                                 QUERY PLAN
>> ------------------------------------------------------------------------------
>>  Merge Append  (cost=0.73..11392.19 rows=200001 width=16)
>>    Sort Key: p1.a
>>    ->  Index Scan using ... on p1  (cost=0.12..8.14 rows=1 width=44)
>>    ->  Index Scan using ... on c11  (cost=0.29..3857.04 rows=100000 width=16)
>>    ->  Index Scan using ... on c12  (cost=0.29..3857.04 rows=100000 width=16)
>
> Nevertheless, UNION ALL on partitioned tables doesn't. This is
> quite unfavourable behavior especially having LIMIT.
>
>>uniontest=# EXPLAIN ANALYZE SELECT * FROM p1
>>            UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10;
>>                              QUERY PLAN
>>------------------------------------------------------------------------
>> Limit   (actual time=182.732..182.735 rows=10 loops=1)
>>   ->  Sort  (actual time=182.729..182.730 rows=10 loops=1)
>>     Sort Key: p1.a
>>     Sort Method: top-N heapsort  Memory: 25kB
>>     ->  Append  (actual time=0.029..108.109 rows=400000 loops=1)
>>       ->  Seq Scan on p1  (actual time=0.001..0.001 rows=0 loops=1)
>>       ->  Seq Scan on c11  (actual time=0.027..19.074 rows=100000 loops=1)
>>       ->  Seq Scan on c12  (actual time=0.014..16.653 rows=100000 loops=1)
>>       ->  Seq Scan on p2  (actual time=0.000..0.000 rows=0 loops=1)
>>       ->  Seq Scan on c21  (actual time=0.012..16.677 rows=100000 loops=1)
>>       ->  Seq Scan on c22  (actual time=0.012..16.794 rows=100000 loops=1)
>> Total runtime: 182.857 ms
>
>
> 2. The cause
>
> In grouping_planner, flattern_simple_union_all creates
> appendrelinfos for each subqueries then expand_inherited_tables
> furthur expands the parent tables in each subqueries. Finally,
> this sequence leaves following structure. Where rte[2] and [3]
> are subqueries abandoned on the way pulling up rte[4] and [5].
>
> rte[1] Subquery "SELECT*1", inh = 1
>    +- appinfo[0] - rte[4] Relation "p1", inh = 1
>    |               +- appinfo[2] - rte[6]  Relation "p1", inh = 0
>    |               +- appinfo[3] - rte[7]  Relation "c11", inh = 0
>    |               +- appinfo[4] - rte[8]  Relation "c12", inh = 0
>    +- appinfo[1] - rte[5] Relation "p2", inh = 1
>                    +- appinfo[5] - rte[9]  Relation "p1", inh = 0
>                    +- appinfo[6] - rte[10] Relation "c11", inh = 0
>                    +- appinfo[7] - rte[11] Relation "c12", inh = 0
>
> On the other hand, ec member finally has exprs only for varno =
> 1, 4 and 5, in other words, it lacks the members for the most
> descendant RTEs.  This is because add_child_rel_equivalences()
> always inhibits add_eq_member from registering the child_rel's
> relid on EC. Conequently these grandchild relations does not find
> index pathkeys for themselves.
>
>
> 3. Measures
>
> I could thought up three approaches for the problem.
>
> One is to simplly modifying here to give child flag in the
> parameters of add_eq_member accordig to whether the child_rel is
> inh or not. This seems to work fine although leaves two levels of
> MergeAppends (PATCH-1). So the additional patch is attached to
> collapse these MergeAppends (PATCH-2). This gives the same plan
> as PATCH-3.
>
>> uniontest=# explain analyze select * from p1 union all
>>                             select * from p2 order by a limit 10;
>>                                QUERY PLAN
>> ------------------------------------------------------------------------
>>  Limit  (actual time=0.208..0.224 rows=10 loops=1)
>>    ->  Merge Append  (actual time=0.205..0.218 rows=10 loops=1)
>>      Sort Key: p1.a
>>      ->  Merge Append  (actual time=0.110..0.120 rows=10 loops=1)
>>        Sort Key: p1.a
>>        ->  Index Scan .. on p1  (actual time=0.006..0.006 rows=0 loops=1)
>>        ->  Index Scan .. on c11  (actual time=0.054..0.060 rows=10 loops=1)
>>        ->  Index Scan .. on c12  (actual time=0.046..0.046 rows=1 loops=1)
>>      ->  Merge Append  (actual time=0.093..0.093 rows=1 loops=1)
>>        Sort Key: p2.a
>>        ->  Index Scan .. on p2  (actual time=0.002..0.002 rows=0 loops=1)
>>        ->  Index Scan .. on c21  (actual time=0.047..0.047 rows=1 loops=1)
>>        ->  Index Scan .. on c22  (actual time=0.043..0.043 rows=1 loops=1)
>>  Total runtime: 0.448 ms
>
>
> The second is to collapse the appendrel structure shown above to
> have only single level inheritance. This also seems to work
> fine. This makes the plan with single level
> MergeAppend. (PATCH-3)
>
>> uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL
>>             SELECT * FROM p2 ORDER BY a LIMIT 10;
>>                                   QUERY PLAN
>> -------------------------------------------------------------------
>>  Limit  (actual time=0.095..0.107 rows=10 loops=1)
>>    ->  Merge Append  (actual time=0.092..0.102 rows=10 loops=1)
>>      Sort Key: p1.a
>>      ->  Index Scan ... on p1 (actual time=0.005..0.005 rows=0 loops=1)
>>      ->  Index Scan ... on c11  (actual time=0.023..0.030 rows=10 loops=1)
>>      ->  Index Scan ... on c12  (actual time=0.020..0.020 rows=1 lops=1)
>>      ->  Index Scan ... on p2  (actual time=0.001..0.001 rows=0 loops=1)
>>      ->  Index Scan ... on c21  (actual time=0.019..0.019 rows=1 loops=1)
>>      ->  Index Scan ... on c22  (actual time=0.019..0.019 rows=1 loops=1)
>>  Total runtime: 0.241 ms
>
>
>
> The last one is most strait-forward in some sense, and conversely
> is most ad hoc measure. Modifying expand_inherited_rtentry() to
> pull up adding appendrels if needed, using the similar manner to
> PATCH-3. This is added as PATCH-4.
>
>
> what do you think about this?
>
>
> Four patches following are attached to this message.
>
> 1. union_inh_idx_typ1_v1_20131024.patch     .. PATCH-1 described above.
> 2. union_inh_idx_typ1_add_v1_20131024.patch .. PATCH-2 described above.
> 3. union_inh_idx_typ2_v1_20131024.patch     .. PATCH-3 described above.
> 4. union_inh_idx_typ3_v1_20131024.patch     .. PATCH-4 described above.

Please add your patches to the currently-open CommitFest so that we
don't lose track of them.

https://commitfest.postgresql.org/action/commitfest_view/open

I'm not sure which approach to this problem is best, but I agree that
it is worth solving.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: CLUSTER FREEZE
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: lob conversion functionality