Home > mailing lists
Proposal: Partitioning Advisor for PostgreSQL - Mailing list pgsql-hackers

From	Yuzuko Hosoya
Subject	Proposal: Partitioning Advisor for PostgreSQL
Date	May 24, 2018 13:46:17
Msg-id	009901d3f34c$71e1bdc0$55a53940$@lab.ntt.co.jp Whole thread Raw
Responses	Re: Proposal: Partitioning Advisor for PostgreSQL Re: Proposal: Partitioning Advisor for PostgreSQL
List	pgsql-hackers
Tree view
Hello,

I'm Yuzuko Hosoya. This is my first PostgreSQL project participation.

I have been developing partitioning advisor prototype with Julien Rouhaud.
It will be a new feature of HypoPG[1], which is a PostgreSQL extension, and
will help partitioning design tuning.  Currently, HypoPG only supports index
design tuning; it allows users to define hypothetical indexes for real tables and
shows resulting queries' plan/cost with EXPLAIN as if they were actually constructed.
Since declarative partitioning will be greatly improved in PostgreSQL 11 and further
versions, there are emerging needs to support partitioning design tuning. This is
why we are working on partitioning advisor.  We plan to release the first version 
of partitioning advisor for PostgreSQL 11, and then, improve it for PostgreSQL 12.


Overview of partitioning advisor
---------------------------------------
- Partitioning advisor allows users to define multiple hypothetical partitioning
  schemes on real tables and real data
- PostgreSQL can show resulting queries' plan/cost with EXPLAIN using hypothetical
  partitioning schemes
Users can quickly check how their queries would behave if some tables were 
partitioned, and try different partitioning schemes (for instance, to optimize some
queries efficiency Vs. maintenance efficiency).


Partitioning advisor works as follows:

Usage
---------
0. Consider this target table, t1
    #= CREATE TABLE t1 (a int, b text);
    #= INSERT INTO t1 SELECT i, 'test' FROM generate_series(1,299) i ;
    #= EXPLAIN SELECT * FROM t1;
          QUERY PLAN
   -----------------------------------------------------
   Seq Scan on t1  (cost=0.00..4.99 rows=299 width=9)
   (1 row)

1. Partition the target table hypothetically
    #= SELECT * FROM hypopg_partition_table('t1','partition by range(a)');
   The hypopg_partition_table() defines hypothetical range partitioned table 't1' 
   by the partition key 'a' and stores these information into backend local memory.

2. Create hypothetical partitions
    #= SELECT * FROM hypopg_add_partition('t1_1','partition of t1 for values from (1) to (100)');
    #= SELECT * FROM hypopg_add_partition('t1_2','partition of t1 for values from (100) to (300)');
   The hypopg_add_partition() defines hypothetical partitions t1_1 and t1_2 according 
   to their bounds 'from (1) to (100)' and 'from (100) to (300)' respectively, and stores 
   these information into backend local memory.

3. PostgreSQL can show resulting queries' plan/cost with EXPLAIN
    #= EXPLAIN SELECT * FROM t1;
                        QUERY PLAN
   ---------------------------------------------------------------
   Append  (cost=0.00..7.49 rows=299 width=9)
     ->  Seq Scan on t1 t1_1  (cost=0.00..1.99 rows=99 width=9)
     ->  Seq Scan on t1 t1_2  (cost=0.00..4.00 rows=200 width=9)
   (3 rows)
  PostgreSQL retrieves hypothetical partitioning schemes from HypoPG.
  And then if the referred table is defined as hypothetical partitioned table, 
  PostgreSQL creates plans using them.

This is a simple example. In addition, it enables us to simulate range/list/hash
partitioning, partition pruning, N-way join and partition-wise join/aggregation.  
It is already helpful for users to design partitioning schemes.


Current implementation
------------------------
We mainly use get_relation_info_hook().  What we do in this hook is to inject
hypothetical partitioning schemes according to user definition.  At first, we do
all processes that are done at expand_inherited_tables().  Specifically, we 
expand root->simple_rte_array and root->parse->rtable, rewrite target 
table's RangeTblEntry as a partitioned table, and create RangeTblEntries and 
AppendRelInfos for all hypothetical partitions.  Besides that, we set hypothetical
partition's name into rte->alias->aliasname at this time to display hypothetical
partition's name with EXPLAIN.  And then, we rewrite RelOptInfo as needed.  
Specifically, we add partition information, which is set at set_relation_partition_info(),
to hypothetical partitioned tables, and set rel->tuples and rel->pages for 
hypothetical partitions.


However, PostgreSQL isn't designed to have hypothetical tables, so we have 
some problematic blockers for an implementation as follows.  We'd like to 
discuss these topics.

Topics of discussion
---------------------
- Expanding partition's RTE
We have to do all processes which are done at expand_inherited_tables() for 
hypothetical partitions. But, since there are no hooks around here, we use 
get_relation_info_hook() as I mentioned above. In this case, we cannot simulate
update queries correctly, because inheritance_planner() which handles update 
queries is called before get_relation_info_hook().  Therefore, we'd like to see if 
we could add a hook at expand_inherited_tables() to expand hypothetical partitions.

- Showing hypothetical partition's name with EXPLAIN
We set hypothetical partition's name into rte->alias->aliasname as I mentioned
above.  In this case, it is displayed with original table name like this:
   #= EXPLAIN SELECT * FROM t1;
                        QUERY PLAN
    ---------------------------------------------------------------
    Append  (cost=0.00..7.49 rows=299 width=9)
      ->  Seq Scan on t1 t1_1  (cost=0.00..1.99 rows=99 width=9)
      ->  Seq Scan on t1 t1_2  (cost=0.00..4.00 rows=200 width=9)
    (3 rows)
t1 is an original table name and t1_* is hypothetical partition's name. Therefore, 
we'd like to see if we could add a hook to get hypothetical partition's name like 
explain_get_index_name_hook().

- Estimating stats
It is complicated because hypothetical partition has no data.  Currently, we compute
hypothetical partition's size using clauselist_selectivity() according to their partition
bound and original table stats.  As a result, estimate is done with low accuracy, 
especially if there is WHERE clause.  We will improve during developing, but for now, 
we don't have good ideas.

- Reducing extension's code
To define get_relation_info_hook, we have to copy/paste many static functions 
defined in core.  In addition, we have to slightly adapt some functions such as 
RelationGetPartitionDesc() and RelationGetPartitionKey() called from 
set_relation_partition_info(), which retrieve information from system catalogs.  
The reason why we have to adapt is that since objects defined by HypoPG
functions mentioned above doesn't exist in PostgreSQL system catalogs and are
not attached to any existing Relation, so we cannot call RelationGetPartitionDesc()
and RelationGetPartitionKey() as-is.  Specifically, we copy/paste and modify
set_relation_partition_info() and called functions (find_partition_scheme() and
set_baserel_partition_key_exprs()) to get hypothetical partition information.  
Therefore, we'd like to see if we could add hooks to set_relation_partition_info()
to get hypothetical PartitionDesc/PartitionKey and change find_partition_scheme()
and set_baserel_partition_key_exprs() so that we use PartitionKey as the argument 
to these functions, not Relation.


For now, it's a working prototype, and we'd be happy to have some feedback.
And also, we'd be very interested if anyone has good ideas on how to solve the
problems we're facing, and how such a feature could be more integrated into the
core using some new hooks or with other infrastructures.


[1] https://github.com/HypoPG/hypopg

Best regards,
----
Yuzuko Hosoya
NTT Open Source Software Center
pgsql-hackers by date:
From: Christoph Berg
Date: 24 May 2018, 13:08:16
Subject: Re: Shared PostgreSQL libraries and symbol versioning
From: Pavel Raiskup
Date: 24 May 2018, 15:01:00
Subject: Re: Shared PostgreSQL libraries and ABI versioning
Proposal: Partitioning Advisor for PostgreSQL - Mailing list pgsql-hackers

Previous

Next