Re: [HACKERS] Partitioned tables and relfilenode - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: [HACKERS] Partitioned tables and relfilenode |
Date | |
Msg-id | CAB7nPqREJRXSTk9fQC2oQgAyQXapeNxjs1SQC2NDS7JOm-FKBQ@mail.gmail.com Whole thread Raw |
In response to | [HACKERS] Partitioned tables and relfilenode (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: [HACKERS] Partitioned tables and relfilenode
Re: [HACKERS] Partitioned tables and relfilenode |
List | pgsql-hackers |
On Fri, Feb 10, 2017 at 3:19 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > The new partitioned tables do not contain any data by themselves. Any > data inserted into a partitioned table is routed to and stored in one of > its partitions. In fact, it is impossible to insert *any* data before a > partition (to be precise, a leaf partition) is created. It seems wasteful > then to allocate physical storage (files) for partitioned tables. If we > do not allocate the storage, then we must make sure that the right thing > happens when a command that is intended to manipulate a table's storage > encounters a partitioned table, the "right thing" here being that the > command's code either throws an error or warning (in some cases) if the > specified table is a partitioned table or ignores any partitioned tables > when it reads the list of relations to process from pg_class. Commands > that need to be taught about this are vacuum, analyze, truncate, and alter > table. Specifically: Thanks. I have been looking a bit at this set of patches. > - In case of vacuum, specifying a partitioned table causes a warning > - In case of analyze, we do not throw an error or warning but simply > avoid calling do_analyze_rel() *non-recursively*. Further in > acquire_inherited_sample_rows(), any partitioned tables in the list > returned by find_all_inheritors() are skipped. > - In case of truncate, only the part which manipulates table's physical > storage is skipped for partitioned tables. I am wondering if instead those should not just be errors saying that such operations are just not support. This could be relaxed in the future to mean that a vacuum, truncate or analyze on the partitioned table triggers an operator in cascade. > - ATRewriteTables() skips on the AlteredTableInfo entries for partitioned > tables, because there is nothing to be done. Perhaps that makes sense, foreign tables do that. > - Since we cannot create indexes on partitioned tables anyway, there is > no need to handle cluster and reindex (they throw a meaningful error > already due to the lack of indexes.) Yep. > Patches 0001 and 0002 of the attached implement the above part. 0001 > teaches the above mentioned commands to do the "right thing" as described > above and 0002 teaches heap_create() and heap_create_with_catalog() to not > create any physical storage (none of the forks) for partitioned tables. - else + /* + * Although we cannot analyze partitioned tables themselves, we are + * still be able to do the recursive ANALYZE. + */ + else if (onerel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) { /* No need for a WARNING if we already complainedduring VACUUM */ if (!(options & VACOPT_VACUUM)) It seems to me that it is better style to use an empty else if with only the comment. Comments related to a condition that are outside this condition should be conditional in their formulations. Comments within a condition can be affirmations when they refer to a decided state of things. From patch 2: @@ -1351,7 +1352,12 @@ heap_create_with_catalog(const char *relname, relkind == RELKIND_TOASTVALUE || relkind == RELKIND_PARTITIONED_TABLE); - heap_create_init_fork(new_rel_desc); + /* + * We do not want to create any storage objects for a partitioned + * table. + */ + if (relkind != RELKIND_PARTITIONED_TABLE) + heap_create_init_fork(new_rel_desc); } This does not make much sense with an assertion telling exactly the contrary a couple of lines before. I think that it would make more sense to move the assertion on relkind directly in heap_create_init_fork(). > Then comes 0003, which concerns inheritance planning. In a regular > inheritance set (ie, the inheritance set corresponding to an inheritance > hierarchy whose root is a regular table), the inheritance parents are > included in their role as child members, because they might contribute > rows to the final result. So AppendRelInfo's are created for each such > table by the planner prep phase, which the later planning steps use to > create a scan plan for those tables as the Append's child plans. > Currently, the partitioned tables are also processed by the optimizer as > inheritance sets. Partitioned table inheritance parents do not own any > storage, so we *must* not create scan plans for them. So we do not need > to process them as child members of the inheritance set. 0003 teaches > expand_inherited_rtentry() to not add partitioned tables as child members. > Also, since the root partitioned table RTE is no longer added to the > Append list as the 1st child member, inheritance_planner() cannot assume > that it can install the 1st child RTE as the nominalRelation of a given > ModifyTable node, instead the original root parent table RTE is installed > as the nominalRelation. This is a collection of checks on relkind == RELKIND_PARTITIONED_TABLE to avoid interactions with partition tables. Did you consider doing something in the executor instead? -- Michael
pgsql-hackers by date: