Thread: [HACKERS] Documentation improvements for partitioning
Here are some patches to improve the documentation about partitioned tables: 0001: Adds some details about partition_bound_spec to the CREATE TABLE page, especially: - a note about inclusivity of range partition bounds, - a note about the UNBOUNDED literal in case of range partitioning, - a note about the NULL literal in case of list partitioning, I wonder if the above "note" should be added under the Notes section or are they fine to be added as part of the description of PARTITION OF clause. Also: - in syntax synopsis, it appears now that any expression is OK to be used for individual bound datum, but it's not true. Only literals are allowed. So fixed that. - added an example showing how to create partitions of a range partitioned table with multiple columns in the partition key - added PARTITION BY and PARTITION OF (FOR VALUES) as PostgreSQL extensions in the compatibility section 0002: Adds details about partitioned tables to the DDL chapter (ddl.sgml) - a new section named "Partitioned Tables" right next to the Inheritance and Partitioning sections is created. - examples are added to the existing Partitioning section using the new partitioned tables. Old text about implementing table partitioning using inheritance is kept, sort of as a still supported older alternative. 0003: Add partitioning keywords to keywords.sgml This is all I have for now. Any feedback is greatly appreciated. Adding this to the next CF. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Fri, Feb 3, 2017 at 4:15 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp > wrote:
Here are some patches to improve the documentation about partitioned tables:
0001: Adds some details about partition_bound_spec to the CREATE TABLE
page, especially:
- a note about inclusivity of range partition bounds,
- a note about the UNBOUNDED literal in case of range partitioning,
- a note about the NULL literal in case of list partitioning,
I wonder if the above "note" should be added under the Notes section or
are they fine to be added as part of the description of PARTITION OF
clause. Also:
- in syntax synopsis, it appears now that any expression is OK to be used
for individual bound datum, but it's not true. Only literals are
allowed. So fixed that.
- added an example showing how to create partitions of a range
partitioned table with multiple columns in the partition key
- added PARTITION BY and PARTITION OF (FOR VALUES) as PostgreSQL
extensions in the compatibility section
0002: Adds details about partitioned tables to the DDL chapter (ddl.sgml)
- a new section named "Partitioned Tables" right next to the
Inheritance and Partitioning sections is created.
- examples are added to the existing Partitioning section using the new
partitioned tables. Old text about implementing table partitioning
using inheritance is kept, sort of as a still supported older
alternative.
0003: Add partitioning keywords to keywords.sgml
This is all I have for now. Any feedback is greatly appreciated. Adding
this to the next CF.
Thanks,
Amit
Patch applies.
Overall this looks really good. It goes a long way towards stating some of the things I had to learn through experimentation.
I had to read a really long way into the patch before finding a blurb that I felt wasn't completely clear:
+
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables, that is,
+ cause error when specified.
+ </para>
Here's some other tries at saying the same thing, none of which are completely satisfying:
...ON CONFLICT clause are currently not allowed on partitioned tables and will cause an error?
...ON CONFLICT clause are currently not allowed on partitioned tables and will instead cause an error?
...ON CONFLICT clause will currently cause an error if used on a partitioned table?
Here's some other tries at saying the same thing, none of which are completely satisfying:
...ON CONFLICT clause are currently not allowed on partitioned tables and will cause an error?
...ON CONFLICT clause are currently not allowed on partitioned tables and will instead cause an error?
...ON CONFLICT clause will currently cause an error if used on a partitioned table?
As far as additional issues to cover, this bit:
+ <listitem>
+ <para>
+ One cannot drop a <literal>NOT NULL</literal> constraint on a
+ partition's column, if the constraint is present in the parent table.
+ </para>
+ </listitem>
Maybe we should add something about how one would go about dropping a NOT NULL constraint (parent first then partitions?)
In reviewing this patch, do all our target formats make word spacing irrelevant? i.e. is there any point in looking at the number of spaces after a period, etc?
A final note, because I'm really familiar with partitioning on Postgres and other databases, documentation which is clear to me might not be to someone less familiar with partitioning. Maybe we want another reviewer for that?
Hi Corey, On 2017/02/09 6:14, Corey Huinker wrote: > On Fri, Feb 3, 2017 at 4:15 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> > wrote: > >> Here are some patches to improve the documentation about partitioned >> tables: > > Patch applies. > > Overall this looks really good. It goes a long way towards stating some of > the things I had to learn through experimentation. Thanks a lot for taking a look at it. > I had to read a really long way into the patch before finding a blurb that > I felt wasn't completely clear: > > + > + <para> > + <command>INSERT</command> statements with <literal>ON CONFLICT</> > + clause are currently not allowed on partitioned tables, that is, > + cause error when specified. > + </para> > > > Here's some other tries at saying the same thing, none of which are > completely satisfying: > > ...ON CONFLICT clause are currently not allowed on partitioned tables and > will cause an error? > ...ON CONFLICT clause are currently not allowed on partitioned tables and > will instead cause an error? > ...ON CONFLICT clause will currently cause an error if used on a > partitioned table? The last one sounds better. > As far as additional issues to cover, this bit: > > + <listitem> > + <para> > + One cannot drop a <literal>NOT NULL</literal> constraint on a > + partition's column, if the constraint is present in the parent > table. > + </para> > + </listitem> > > Maybe we should add something about how one would go about dropping a NOT > NULL constraint (parent first then partitions?) Dropping it on the parent will cause it to be dropped on the partitions as well. About your point whether we should add a note about how to go about dropping it in the partition, it seems to me it would be out of place here; it's just saying that dropping NOT NULL constraint has a different behavior with partitioned tables than regular inheritance. That note most likely belongs in the ALTER TABLE reference page in the DROP NOT NULL description, so created a patch for that (patch 0004 of the attached patches). > In reviewing this patch, do all our target formats make word spacing > irrelevant? i.e. is there any point in looking at the number of spaces > after a period, etc? It seems to be a convention in the sources to include 2 spaces after a period, which I just try to follow (both in the code comments and SGML). I don't see that spaces are relevant as far as how the targets such as HTML are rendered. > A final note, because I'm really familiar with partitioning on Postgres and > other databases, documentation which is clear to me might not be to someone > less familiar with partitioning. Maybe we want another reviewer for that? More eyeballs will only help make this better. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 10 February 2017 at 07:35, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> A final note, because I'm really familiar with partitioning on Postgres and >> other databases, documentation which is clear to me might not be to someone >> less familiar with partitioning. Maybe we want another reviewer for that? > > More eyeballs will only help make this better. Given that we already have partitioning feature committed, we really need to have the docs committed as well. Without claiming I'm happy about this, I think the best way to improve the number of eyeballs on this is to commit these docs as is. For me, the most important thing is understanding the feature, not (yet) discussing what the docs should look like. This is especially true if other patches reference the way partitioning works and nobody can comment on those patches because they don't understand Any issues with that? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/02/10 17:00, Simon Riggs wrote: > On 10 February 2017 at 07:35, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > >>> A final note, because I'm really familiar with partitioning on Postgres and >>> other databases, documentation which is clear to me might not be to someone >>> less familiar with partitioning. Maybe we want another reviewer for that? >> >> More eyeballs will only help make this better. > > Given that we already have partitioning feature committed, we really > need to have the docs committed as well. > > Without claiming I'm happy about this, I think the best way to improve > the number of eyeballs on this is to commit these docs as is. > > For me, the most important thing is understanding the feature, not > (yet) discussing what the docs should look like. This is especially > true if other patches reference the way partitioning works and nobody > can comment on those patches because they don't understand > > Any issues with that? I agree that getting the proposed documentation changes committed would be a step ahead. I saw in the logical replication thread that dealing with partitioned tables without the docs explaining what they are has been difficult. Hopefully the proposed documentation improvements help make progress in that regard. Partitioned tables, at this point, have certain limitations which affect its interaction with other features (old or new); documenting those limitations will be helpful not only to the users deciding whether to start using the new partitioned tables right away, but also to the developers of other features who want to understand what partitioned tables are. Thanks, Amit
On 10 February 2017 at 08:18, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > I agree that getting the proposed documentation changes committed would be > a step ahead. Committed. I tested how it works and added documentation that matched my experiences, correcting what you'd said and adding more information for clarity as I went. Few points from tests * "ERROR: cannot create index on partitioned table "measurement_year_month"" is misleading because you can create indexes on partitions * You can create additional CHECK (column is NOT NULL) as a check constraint, even if you can't create a not null constraint * The OID inheritance needs work - you shouldn't need to specify a partition needs OIDS if the parent has it already. * There's no tab completion, which prevents people from testing this (maybe better now with some docs) * ERROR: no partition of relation "measurement_year_month" found for row DETAIL: Failing row contains (2016-12-02, 1, 1). should not return the whole row, just the partition keys * It's very weird you can't DROP a partitioned table. I think you need to add dependencies. * ERROR: TO must specify exactly one value per partitioning column should say something more like "you specified one column value, whereas the partitioning key requires two columns" Good work so far, but there is still a very significant amount of work to do. And as this feature evolves it must now contain full documentation at every step, otherwise it won't be able to receive full and fair review. So please make sure each new patch contains docs changes for that patch. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/02/10 19:19, Simon Riggs wrote: > On 10 February 2017 at 08:18, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > >> I agree that getting the proposed documentation changes committed would be >> a step ahead. > > Committed. I tested how it works and added documentation that matched > my experiences, correcting what you'd said and adding more information > for clarity as I went. Thanks for making the improvements to and committing the patch! > Few points from tests > > * "ERROR: cannot create index on partitioned table "measurement_year_month"" > is misleading because you can create indexes on partitions Do you mean that this should not cause an error at all, but create the specified index on partitions as part of running the command? Should the code to handle that be part of this release? Or do you simply mean that we should rewrite the error message and/or add a HINT asking to create the index on partitions instead? > * You can create additional CHECK (column is NOT NULL) as a check > constraint, even if you can't create a not null constraint Sorry but I am not exactly sure which "cannot create a not null constraint" you are referring to here. There are no restrictions on *creating* constraints on partitions, but there are on dropping. Regular inheritance rules prevent dropping inherited constraints (just the same for partitioned tables), of which there are only named CHECK constraints at the moment. A new rule introduced for partitions prevents dropping a column's NOT NULL constraint if it's been "inherited" (i.e., the same constraint is present in the parent partitioned table), although it's not in the same sense as for CHECK constraints, because NOT NULL constraints are not tracked with pg_constraints. > * The OID inheritance needs work - you shouldn't need to specify a > partition needs OIDS if the parent has it already. That sounds right. It's better to keep the behavior same as for regular inheritance. Will post a patch to get rid of the unnecessary check. FWIW, the check was added to prevent the command from succeeding in the case where WITHOUT OIDS has been specified for a partition and the parent partitioned table has the OID column. Regular inheritance simply *overrides* the WITHOUT OIDS specification, which might be seen as surprising. > * There's no tab completion, which prevents people from testing this > (maybe better now with some docs) Will post a patch as well. > * ERROR: no partition of relation "measurement_year_month" found for row > DETAIL: Failing row contains (2016-12-02, 1, 1). > should not return the whole row, just the partition keys I think that makes sense. Something along the lines of BuildIndexValueDescription() for partition keys will be necessary. Will post a patch. > * It's very weird you can't DROP a partitioned table. I think you need > to add dependencies. Do you mean it should be possible to DROP a partitioned table without needing to specify CASCADE? Currently, same thing happens for a partitioned table as will for a inheritance parent. > * ERROR: TO must specify exactly one value per partitioning column > should say something more like "you specified one column value, > whereas the partitioning key requires two columns" Should that be a DETAIL or HINT message? > Good work so far, but there is still a very significant amount of work > to do. And as this feature evolves it must now contain full > documentation at every step, otherwise it won't be able to receive > full and fair review. So please make sure each new patch contains docs > changes for that patch. Agreed that comprehensive documentation of any new feature is crucial both during development and after the feature is released. Thanks, Amit
On 2017/02/13 14:21, Amit Langote wrote: > On 2017/02/10 19:19, Simon Riggs wrote: >> * The OID inheritance needs work - you shouldn't need to specify a >> partition needs OIDS if the parent has it already. > > That sounds right. It's better to keep the behavior same as for regular > inheritance. Will post a patch to get rid of the unnecessary check. > > FWIW, the check was added to prevent the command from succeeding in the > case where WITHOUT OIDS has been specified for a partition and the parent > partitioned table has the OID column. Regular inheritance simply > *overrides* the WITHOUT OIDS specification, which might be seen as surprising. 0001 of the attached patches takes care of this. >> * There's no tab completion, which prevents people from testing this >> (maybe better now with some docs) > > Will post a patch as well. ...and 0002 for this. >> * ERROR: no partition of relation "measurement_year_month" found for row >> DETAIL: Failing row contains (2016-12-02, 1, 1). >> should not return the whole row, just the partition keys > > I think that makes sense. Something along the lines of > BuildIndexValueDescription() for partition keys will be necessary. Will > post a patch. Let me spend a bit more time on this one. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 2017/02/13 14:21, Amit Langote wrote: > On 2017/02/10 19:19, Simon Riggs wrote: >> On 10 February 2017 at 08:18, Amit Langote >> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> >>> I agree that getting the proposed documentation changes committed would be >>> a step ahead. >> >> Committed. I tested how it works and added documentation that matched >> my experiences, correcting what you'd said and adding more information >> for clarity as I went. > > Thanks for making the improvements to and committing the patch! Since I had added this to CF 2017-03, I have marked it as committed. https://commitfest.postgresql.org/13/983/ Thanks, Amit
On Fri, Feb 10, 2017 at 3:00 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Given that we already have partitioning feature committed, we really > need to have the docs committed as well. Just for the record, it's not like there were no documentation changes in the originally committed patch. In fact there were over 400 lines of documentation: doc/src/sgml/catalogs.sgml | 129 +++++++++++++++++++++++-doc/src/sgml/ref/alter_table.sgml | 117+++++++++++++++++++++-doc/src/sgml/ref/create_foreign_table.sgml | 26 +++++doc/src/sgml/ref/create_table.sgml | 154 +++++++++++++++++++++++++++++ The patch you committed approximately doubles the amount of documentation for this feature, which is fine as far as it goes, but the key points were all explained in the original commit. I have been known to leave out documentation from commits from time to time and fill it in after-the-fact, but that's not really what happened here. > Without claiming I'm happy about this, I think the best way to improve > the number of eyeballs on this is to commit these docs as is. > > For me, the most important thing is understanding the feature, not > (yet) discussing what the docs should look like. This is especially > true if other patches reference the way partitioning works and nobody > can comment on those patches because they don't understand > > Any issues with that? There are a number of things that I think are awkward about the patch as committed: + <listitem> + <para> + See last section for some general information: + <xref linkend="ddl-partitioned-tables"> + </para> + </listitem> I think we generally try to write the documentation in such a way as to minimize backward references, and a backward reference to the previous section seems particularly odd. We've now got section "5.10 Partitioned Tables" followed immediately by section "5.11 Partitioning", where the latter seems to think that you haven't read the former. I think that section 5.11 needs a much heavier rewrite than what it got as part of this patch. It's a hodgepodge of the old content (which explained how to fake partitioning when we didn't have an explicit concept of partitioning) and new content that talks about how the new way is different from the old way. But now that we have the new way, I'm guessing that most people are going to use that and not care about the old way any more. I'm not that it's even appropriate to keep the lengthy explanation of how to fake table partitioning using table inheritance and non-overlapping CHECK constraints, but if we still want that stuff it should be de-emphasized more than it is here. Probably the section should be retitled: in previous releases we called this "Partitioning" because we had no explicit notion of partitioning, but now that we do, it's confusing to have a section called "Partitioning" that explains how to avoid using the partitioning feature, which is more or less what this does. Or maybe the section title should stay the same (or this should be merged into the previous section?) but rewritten to change the emphasis. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/02/13 14:21, Amit Langote wrote: >> On 2017/02/10 19:19, Simon Riggs wrote: >>> * The OID inheritance needs work - you shouldn't need to specify a >>> partition needs OIDS if the parent has it already. >> >> That sounds right. It's better to keep the behavior same as for regular >> inheritance. Will post a patch to get rid of the unnecessary check. >> >> FWIW, the check was added to prevent the command from succeeding in the >> case where WITHOUT OIDS has been specified for a partition and the parent >> partitioned table has the OID column. Regular inheritance simply >> *overrides* the WITHOUT OIDS specification, which might be seen as surprising. > > 0001 of the attached patches takes care of this. I think 0001 needs to remove this hunk of documentation: <listitem> <para> If the partitioned table specified <literal>WITH OIDS</literal> then each partitionmust also specify <literal>WITH OIDS</literal>. Oids are not automatically inherited by partitions. </para> </listitem> I think 0001 is better than the status quo, but I'm wondering whether we should try to do something slightly different. Maybe it should always work for the child table to specify neither WITH OIDS nor WITHOUT OIDS, but if you do specify one of them then it has to be the one that matches the parent partitioned table? With this patch, IIUC, WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS is allowed (but ignored) regardless of the parent setting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Noticed some typos in the documentation. Here's patch to correct those. Sorry, if it has been already taken care of. On Wed, Feb 15, 2017 at 10:01 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> On 2017/02/13 14:21, Amit Langote wrote: >>> On 2017/02/10 19:19, Simon Riggs wrote: >>>> * The OID inheritance needs work - you shouldn't need to specify a >>>> partition needs OIDS if the parent has it already. >>> >>> That sounds right. It's better to keep the behavior same as for regular >>> inheritance. Will post a patch to get rid of the unnecessary check. >>> >>> FWIW, the check was added to prevent the command from succeeding in the >>> case where WITHOUT OIDS has been specified for a partition and the parent >>> partitioned table has the OID column. Regular inheritance simply >>> *overrides* the WITHOUT OIDS specification, which might be seen as surprising. >> >> 0001 of the attached patches takes care of this. > > I think 0001 needs to remove this hunk of documentation: > > <listitem> > <para> > If the partitioned table specified <literal>WITH OIDS</literal> then > each partition must also specify <literal>WITH OIDS</literal>. Oids > are not automatically inherited by partitions. > </para> > </listitem> > > I think 0001 is better than the status quo, but I'm wondering whether > we should try to do something slightly different. Maybe it should > always work for the child table to specify neither WITH OIDS nor > WITHOUT OIDS, but if you do specify one of them then it has to be the > one that matches the parent partitioned table? With this patch, IIUC, > WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS > is allowed (but ignored) regardless of the parent setting. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Wed, Feb 15, 2017 at 4:26 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > Noticed some typos in the documentation. Here's patch to correct > those. Sorry, if it has been already taken care of. Thanks. That is indeed nonstandard capitalization. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 13 February 2017 at 05:21, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Few points from tests >> >> * "ERROR: cannot create index on partitioned table "measurement_year_month"" >> is misleading because you can create indexes on partitions > > Do you mean that this should not cause an error at all, but create the > specified index on partitions as part of running the command? Should the > code to handle that be part of this release? Sounds fairly basic to me. If you don't support this, then presumably every ORM, pgAdmin etc will all be broken. And 1000 people will need to write a script that does what we could do easily in a loop internally. At present you haven't even documented how you'd do this. I see you might want to create an index on a subset of partitions, but I expect that you've thought of that and have a proposal for syntax that allows it. Though the default should be that CREATE INDEX just works. > Or do you simply mean that we should rewrite the error message and/or add > a HINT asking to create the index on partitions instead? You could >> * It's very weird you can't DROP a partitioned table. I think you need >> to add dependencies. > > Do you mean it should be possible to DROP a partitioned table without > needing to specify CASCADE? Currently, same thing happens for a > partitioned table as will for a inheritance parent. If we wanted them to act identically we wouldn't have any need for a new feature at all, so clearly that doesn't make sense as an argument. If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it has indexes, sequences etc on it. So why should it just because it has partitions? Most especially if they were created automatically for me in the first place. I might just understand that running ATTACH TABLE might change that viewpoint. I'm pretty sure DROP TABLE and CREATE INDEX are fairly basic expectations from users about how tables should work, partitioned or otherwise. It leaves me asking what else is missing. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 02/15/2017 06:10 AM, Simon Riggs wrote: > On 13 February 2017 at 05:21, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it > has indexes, sequences etc on it. So why should it just because it has > partitions? Because partitions may have data. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
Joshua D. Drake wrote: > On 02/15/2017 06:10 AM, Simon Riggs wrote: > > On 13 February 2017 at 05:21, Amit Langote > > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > > > If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it > > has indexes, sequences etc on it. So why should it just because it has > > partitions? > > Because partitions may have data. So would the table, were it not partitioned. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 15, 2017 at 9:37 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Joshua D. Drake wrote: >> On 02/15/2017 06:10 AM, Simon Riggs wrote: >> > On 13 February 2017 at 05:21, Amit Langote >> > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> >> > If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it >> > has indexes, sequences etc on it. So why should it just because it has >> > partitions? >> >> Because partitions may have data. > > So would the table, were it not partitioned. True. I think the question here is: do we want to view the dependency between a partitioned table and a partition of that table as DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's always been "normal" and I'm not sure there's any good reason for partitioning to make the opposite decision. The new partitioning implementation provides a user experience that is overall smoother than doing the same thing with inheritance, but it's not as if you can ignore the fact that your partitioned tables have sub-objects that are also tables. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 15, 2017 at 9:10 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> * "ERROR: cannot create index on partitioned table "measurement_year_month"" >>> is misleading because you can create indexes on partitions >> >> Do you mean that this should not cause an error at all, but create the >> specified index on partitions as part of running the command? Should the >> code to handle that be part of this release? > > Sounds fairly basic to me. If you don't support this, then presumably > every ORM, pgAdmin etc will all be broken. I don't see why that should be the case. > And 1000 people will need to write a script that does what we could do > easily in a loop internally. Now that is probably true. > At present you haven't even documented how you'd do this. It's not that hard to figure it out, though. A HINT wouldn't be a bad idea, certainly. There are some really thorny problems with making index creation cascade to all of the partitions. I think it's worth doing, but there's a lot of stuff to think about before you go and start writing code. Most obviously, if you can use a single CREATE INDEX statement to create indexes on all of the partitions, then probably you ought to also be able to use DROP INDEX to get rid of all of those indexes. In other words, it should probably work a lot like what already happens with constraints: constraints cascade from the parent down to the children, but we still know which child object goes with which parent object, so if the parent object is dropped we can get rid of all of the children. I think we need something similar here, although if we restrict it to the partitioning case and don't make it work with table inheritance then it can be simpler since table partitioning doesn't allow multiple inheritance. Presumably we'd want other index commands like REINDEX to cascade similarly. Also, it's not entirely clear what the semantics should be. If the partitioning key is (a) and you ask for an index on (a, b), you could conceivably omit a from the indexes created on partitions that only cover a single value of a. (That case is easy to detect when list partitioning is in use.) Should we try do that elimination, or just do what the user asked for? Will users be unhappy if we try to do this sort of column elimination but it only works in simple cases? Think about the possibility that there are partitioning expressions rather than partitioning columns before concluding we can make it work in all cases. On the other hand, if you ask for a UNIQUE index on (b), should we go ahead and create such an index on each partition, ensuring uniqueness within each partition, or should we refuse to proceed on the grounds that we can't be sure that such an index will ensure global uniqueness? If you do the former, someone might find the behavior surprising, but if you do the latter, you might annoy people who know what they're asking for and want that thing but can't get it. I suspect we want to eventually allow a user to ask for either one, because eventually we'll probably have global indexes, and then you really need a way to say whether you want a global index or a partitioned non-global index. But that requires agreeing on syntax, which is complicated and will probably involve a lot of bikeshedding (as well it should - these are big decisions). I think it would be a bad idea to try to fix this problem for v10. One of the earlier versions of the patch allowed indexes on the parent table as if it were just a regular empty table, which did not seem useful. I asked him to disallow that so as to keep our options open for the future. I see no reason why v11 or v12 can't fill in the functionality in this area. Right now we're 2 weeks away from the start of the last CommitFest, and that's not the time to go start writing a complex patch for a feature that isn't even particularly well-defined. If somebody really cared about this make-an-index-for-everything-in-the-hierarchy problem, they could've written a patch for that at any time in the last 5 years; it's not strictly dependent on the new partitioning stuff. Nobody's done that, and trying to throw together something now in the last couple of weeks could easily end with us getting it wrong and then having to face the unpleasant prospect of either leaving it broken or breaking backward compatibility to fix it. > It leaves me asking what else is missing. There is certainly a lot of room for improvement here but I don't understand your persistent negativity about what's been done thus far. I think it's pretty clearly a huge step forward, and I think Amit deserves a ton of credit for making it happen. The improvements in bulk loading performance alone are stupendous. You apparently have the idea that somebody could have written an even larger patch that solved even more problems at once, but this was already a really big patch, and IMHO quite a good one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Wed, Feb 15, 2017 at 9:37 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > Joshua D. Drake wrote: > >> Because partitions may have data. > > > > So would the table, were it not partitioned. > > True. I think the question here is: do we want to view the dependency > between a partitioned table and a partition of that table as > DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's > always been "normal" and I'm not sure there's any good reason for > partitioning to make the opposite decision. I think new-style partitioning is supposed to consider each partition as an implementation detail of the table; the fact that you can manipulate partitions separately does not really mean that they are their own independent object. You don't stop to think "do I really want to drop the TOAST table attached to this main table?" and attach a CASCADE clause if so. You just drop the main table, and the toast one is dropped automatically. I think new-style partitions should behave equivalently. You can make the partition an independent entity, but if you don't explicitly take that step beforehand, I don't see why we should see it that way implicitly. > The new partitioning > implementation provides a user experience that is overall smoother > than doing the same thing with inheritance, but it's not as if you can > ignore the fact that your partitioned tables have sub-objects that are > also tables. Now that partitions are declarative, the underlying implementation could change away from inheritance. It's now just an implementation artifact. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I think new-style partitioning is supposed to consider each partition as > an implementation detail of the table; the fact that you can manipulate > partitions separately does not really mean that they are their own > independent object. You don't stop to think "do I really want to drop > the TOAST table attached to this main table?" and attach a CASCADE > clause if so. You just drop the main table, and the toast one is > dropped automatically. I think new-style partitions should behave > equivalently. That's a reasonable point of view. I'd like to get some more opinions on this topic. I'm happy to have us do whatever most people want, but I'm worried that having table inheritance and table partitioning work differently will be create confusion. I'm also suspicious that there may be some implementation difficulties. On the hand, it does seem a little silly to say that DROP TABLE partitioned_table should always fail except in the degenerate case where there are no partitions, so maybe changing it is for the best. > Now that partitions are declarative, the underlying implementation could > change away from inheritance. It's now just an implementation artifact. I don't really agree with that. It's true that, for example, we could decide to store the inheritance information for partitions someplace other than pg_inherits, but from a user perspective these are always going to be closely-related features. Also, there are quite a number of ways in which it's very noticeable that the objects are separate. They are dumped separately. They have separate indexes, and even if we provide some facility to create a common indexing scheme across all partitions automatically, you'll still be able to REINDEX or CLUSTER one of those tables individually. They can have different storage properties, like one can be UNLOGGED while another is not. They show up in EXPLAIN plans. The partitioning structure affects what you can and can't do with foreign keys. All of those are user-visible things that make this look and feel like a collection of tables, not just a single table that happens to have several relfilenodes under the hood. I think that's actually a really important feature, not a design defect. As you may or may not know, EDB has had a proprietary implementation of table partitioning since Advanced Server 9.1, and one of the things we've learned from that implementation is that users really like to be able to fiddle with the individual partitions. They want to things like make them individually unlogged, rename them, vacuum them, add contraints, add triggers, put them in different tablespaces, vary indexing strategies, all the stuff that you normally do with standalone tables. Any time one of those things didn't work quite like it would for a standalone table, we got complaints. One of the things that has become really clear over the five years that feature has been out of the field is that users value the ability to do different things with different child tables. Now that of course does not mean that they don't want to be able to operate on the hierarchy as a whole; we have numerous feature requests in that direction, too. But, at least among the base of customers that have talked to us about the proprietary partitioning feature in Advanced Server, wanting to treat a partition as a table and do some arbitrary thing to it that can be done to a table is extremely common. Of course, I can't promise (and am not trying to argue) that the reaction among PostgreSQL users generally will necessarily be similar to the experience we've had with our Advanced Server customers, but this experience has definitely caused me to lean in the direction of wanting partitions to be first-class objects. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Robert Haas wrote: >> True. I think the question here is: do we want to view the dependency >> between a partitioned table and a partition of that table as >> DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's >> always been "normal" and I'm not sure there's any good reason for >> partitioning to make the opposite decision. > I think new-style partitioning is supposed to consider each partition as > an implementation detail of the table; the fact that you can manipulate > partitions separately does not really mean that they are their own > independent object. You don't stop to think "do I really want to drop > the TOAST table attached to this main table?" and attach a CASCADE > clause if so. You just drop the main table, and the toast one is > dropped automatically. I think new-style partitions should behave > equivalently. I agree with Alvaro's position. If you need CASCADE to get rid of the individual partitions, that's going to be a serious usability fail. regards, tom lane
On 2017/02/15 13:31, Robert Haas wrote: > On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> On 2017/02/13 14:21, Amit Langote wrote: >>> On 2017/02/10 19:19, Simon Riggs wrote: >>>> * The OID inheritance needs work - you shouldn't need to specify a >>>> partition needs OIDS if the parent has it already. >>> >>> That sounds right. It's better to keep the behavior same as for regular >>> inheritance. Will post a patch to get rid of the unnecessary check. >>> >>> FWIW, the check was added to prevent the command from succeeding in the >>> case where WITHOUT OIDS has been specified for a partition and the parent >>> partitioned table has the OID column. Regular inheritance simply >>> *overrides* the WITHOUT OIDS specification, which might be seen as surprising. >> >> 0001 of the attached patches takes care of this. > > I think 0001 needs to remove this hunk of documentation: > > <listitem> > <para> > If the partitioned table specified <literal>WITH OIDS</literal> then > each partition must also specify <literal>WITH OIDS</literal>. Oids > are not automatically inherited by partitions. > </para> > </listitem> Attached updated 0001 which does that. > I think 0001 is better than the status quo, but I'm wondering whether > we should try to do something slightly different. Maybe it should > always work for the child table to specify neither WITH OIDS nor > WITHOUT OIDS, but if you do specify one of them then it has to be the > one that matches the parent partitioned table? With this patch, IIUC, > WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS > is allowed (but ignored) regardless of the parent setting. With the patch, one can always specify (or not) WITH/WITHOUT OIDS when creating partitions. If WITH OIDS is specified and the parent doesn't have OIDs, then an error is raised. Then just like with normal inheritance, WITHOUT OIDS specification for a partition will be *overridden* if the parent has OIDs. By the way, CREATE TABLE page says this about inheritance and OIDS: (If the new table inherits from any tables that have OIDs, then <literal>OIDS=TRUE</> is forced even if the command says <literal>OIDS=FALSE</>. Hopefully it's clear to someone reading "If the table inherits from any tables ..." that it also refers to creating partition of a partitioned table. Also attaching 0002 (unchanged) for tab-completion support for the new partitioning syntax. 0003 changes how ExecFindPartition() shows the row for which get_partition_for_tuple() failed to find a partition. As Simon commented upthread, we should show just the partition key, not the whole row in the error DETAIL. So the DETAIL now looks like what's shown by _bt_check_unique() upon uniqueness violation: DETAIL: Partition key of the failing row contains (key1, key2, ...)=(val1, val2, ...) The rules about which columns to show or whether to show the DETAIL at all are similar to those in BuildIndexValueDescription(): - if user has SELECT privilege on the whole table, simply go ahead - if user doesn't have SELECT privilege on the table, check that they can see all the columns in the key (no point in showing partial key); however abort on finding an expression for which we don't try finding out privilege situation of whatever columns may be in the expression Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 2017/02/16 2:08, Robert Haas wrote: > On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> I think new-style partitioning is supposed to consider each partition as >> an implementation detail of the table; the fact that you can manipulate >> partitions separately does not really mean that they are their own >> independent object. You don't stop to think "do I really want to drop >> the TOAST table attached to this main table?" and attach a CASCADE >> clause if so. You just drop the main table, and the toast one is >> dropped automatically. I think new-style partitions should behave >> equivalently. > > That's a reasonable point of view. I'd like to get some more opinions > on this topic. I'm happy to have us do whatever most people want, but > I'm worried that having table inheritance and table partitioning work > differently will be create confusion. I'm also suspicious that there > may be some implementation difficulties. On the hand, it does seem a > little silly to say that DROP TABLE partitioned_table should always > fail except in the degenerate case where there are no partitions, so > maybe changing it is for the best. So I count more than a few votes saying that we should be able to DROP partitioned tables without specifying CASCADE. I tried to implement that using the attached patch by having StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between parent and child if the child is a partition, instead of DEPENDENCY_NORMAL that would otherwise be created. Now it seems that that is one way of making sure that partitions are dropped when the root partitioned table is dropped, not sure if the best; why create the pg_depend entries at all one might ask. I chose it for now because that's the one with fewest lines of change. Adjusted regression tests as well, since we recently tweaked tests [1] to work around the irregularities of test output when using CASCADE. Thanks, Amit [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c397814 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
In the documentation of ALTER TABLE ... ATTACH PARTITION its implicit that partition name specified should be the name of the existing table being attached. Same is the case with DETACH PARTITION where its implicit that the detached partition becomes a stand-alone table with same name as the partition being detached. I think this needs to be more explicit. PFA patch on those lines. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Forgot to attach the patch. Thanks Rajkumar for notifying me. On Fri, Feb 17, 2017 at 11:18 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > In the documentation of ALTER TABLE ... ATTACH PARTITION its implicit > that partition name specified should be the name of the existing table > being attached. Same is the case with DETACH PARTITION where its > implicit that the detached partition becomes a stand-alone table with > same name as the partition being detached. I think this needs to be > more explicit. PFA patch on those lines. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Fri, Feb 17, 2017 at 11:25 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > Forgot to attach the patch. Thanks Rajkumar for notifying me. I think this is overexplaining what is anyway obvious. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 16, 2017 at 7:15 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> I think 0001 is better than the status quo, but I'm wondering whether >> we should try to do something slightly different. Maybe it should >> always work for the child table to specify neither WITH OIDS nor >> WITHOUT OIDS, but if you do specify one of them then it has to be the >> one that matches the parent partitioned table? With this patch, IIUC, >> WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS >> is allowed (but ignored) regardless of the parent setting. > > With the patch, one can always specify (or not) WITH/WITHOUT OIDS when > creating partitions. If WITH OIDS is specified and the parent doesn't > have OIDs, then an error is raised. Then just like with normal > inheritance, WITHOUT OIDS specification for a partition will be > *overridden* if the parent has OIDs. By the way, CREATE TABLE page says > this about inheritance and OIDS: > > (If the new table inherits from any tables that have OIDs, then > <literal>OIDS=TRUE</> is forced even if the command says > <literal>OIDS=FALSE</>. > > Hopefully it's clear to someone reading "If the table inherits from any > tables ..." that it also refers to creating partition of a partitioned table. I rewrote this to be a bit more explicit and committed it. I noticed that there was some duplication here: the old text said both this: A partition cannot have columns other than those inherited from the parent. And also this just a little later in the same page: A partition must have the same column names and types as the table of which it is a partition. The second wording seemed better, so I moved that statement a little higher up and rejiggered the wording to be super-clear about OIDs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 16, 2017 at 12:43 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/02/16 2:08, Robert Haas wrote: >> On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera >> <alvherre@2ndquadrant.com> wrote: >>> I think new-style partitioning is supposed to consider each partition as >>> an implementation detail of the table; the fact that you can manipulate >>> partitions separately does not really mean that they are their own >>> independent object. You don't stop to think "do I really want to drop >>> the TOAST table attached to this main table?" and attach a CASCADE >>> clause if so. You just drop the main table, and the toast one is >>> dropped automatically. I think new-style partitions should behave >>> equivalently. >> >> That's a reasonable point of view. I'd like to get some more opinions >> on this topic. I'm happy to have us do whatever most people want, but >> I'm worried that having table inheritance and table partitioning work >> differently will be create confusion. I'm also suspicious that there >> may be some implementation difficulties. On the hand, it does seem a >> little silly to say that DROP TABLE partitioned_table should always >> fail except in the degenerate case where there are no partitions, so >> maybe changing it is for the best. > > So I count more than a few votes saying that we should be able to DROP > partitioned tables without specifying CASCADE. > > I tried to implement that using the attached patch by having > StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between > parent and child if the child is a partition, instead of DEPENDENCY_NORMAL > that would otherwise be created. Now it seems that that is one way of > making sure that partitions are dropped when the root partitioned table is > dropped, not sure if the best; why create the pg_depend entries at all one > might ask. I chose it for now because that's the one with fewest lines of > change. Adjusted regression tests as well, since we recently tweaked > tests [1] to work around the irregularities of test output when using CASCADE. Could you possibly post this on a new thread with a reference back to this one? The number of patches on this one is getting a bit hard to track, and some people may be under the misimpression that this one is just about documentation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote: >> It leaves me asking what else is missing. > > There is certainly a lot of room for improvement here but I don't > understand your persistent negativity about what's been done thus far. > I think it's pretty clearly a huge step forward, and I think Amit > deserves a ton of credit for making it happen. The improvements in > bulk loading performance alone are stupendous. You apparently have > the idea that somebody could have written an even larger patch that > solved even more problems at once, but this was already a really big > patch, and IMHO quite a good one. Please explain these personal comments against me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017/02/20 1:04, Robert Haas wrote: > On Thu, Feb 16, 2017 at 12:43 PM, Amit Langote wrote: >> So I count more than a few votes saying that we should be able to DROP >> partitioned tables without specifying CASCADE. >> >> I tried to implement that using the attached patch by having >> StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between >> parent and child if the child is a partition, instead of DEPENDENCY_NORMAL >> that would otherwise be created. Now it seems that that is one way of >> making sure that partitions are dropped when the root partitioned table is >> dropped, not sure if the best; why create the pg_depend entries at all one >> might ask. I chose it for now because that's the one with fewest lines of >> change. Adjusted regression tests as well, since we recently tweaked >> tests [1] to work around the irregularities of test output when using CASCADE. > > Could you possibly post this on a new thread with a reference back to > this one? The number of patches on this one is getting a bit hard to > track, and some people may be under the misimpression that this one is > just about documentation. Sorry about that. Sent the above message and the patch in a new thread titled "dropping partitioned tables without CASCADE" [1]. Thanks, Amit [1] https://postgr.es/m/6c420206-45d7-3f56-8325-4bd7b76483ba%40lab.ntt.co.jp
On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote: >>> It leaves me asking what else is missing. >> >> There is certainly a lot of room for improvement here but I don't >> understand your persistent negativity about what's been done thus far. >> I think it's pretty clearly a huge step forward, and I think Amit >> deserves a ton of credit for making it happen. The improvements in >> bulk loading performance alone are stupendous. You apparently have >> the idea that somebody could have written an even larger patch that >> solved even more problems at once, but this was already a really big >> patch, and IMHO quite a good one. > > Please explain these personal comments against me. Several of your emails, including your first post to this thread, seemed to me to be quite negative about the state of this feature. I don't think that's warranted, though perhaps I am misreading your tone, as I have been known to do. I also don't think that expressing the opinion that the feature is better than you're giving it credit for is a personal comment against you. Where exactly do you see a personal comment against you in what I wrote? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 15, 2017 at 12:08:19PM -0500, Robert Haas wrote: > On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > I think new-style partitioning is supposed to consider each partition as > > an implementation detail of the table; the fact that you can manipulate > > partitions separately does not really mean that they are their own > > independent object. You don't stop to think "do I really want to drop > > the TOAST table attached to this main table?" and attach a CASCADE > > clause if so. You just drop the main table, and the toast one is > > dropped automatically. I think new-style partitions should behave > > equivalently. > > That's a reasonable point of view. I'd like to get some more opinions > on this topic. I'm happy to have us do whatever most people want, but > I'm worried that having table inheritance and table partitioning work > differently will be create confusion. I'm also suspicious that there > may be some implementation difficulties. On the hand, it does seem a > little silly to say that DROP TABLE partitioned_table should always > fail except in the degenerate case where there are no partitions, so > maybe changing it is for the best. I think we have a precedent for this, and that is the SERIAL data type, which is really just a macro on top of CREATE SEQUENCE and DEFAULT nextval() using the sequence. You can manipulate the sequence and the DEFAULT separately, but if you drop the table the sequence is dropped to automatically. This seems like an instructive example of how we have bundled behavior together in the past in a logical and easy-to-understand way. Of course, their might be some technical limitations that prevent us from using this approach, and I would be interested in hearing those details. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Mon, Feb 20, 2017 at 02:37:44PM +0530, Robert Haas wrote: > On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > > On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote: > >>> It leaves me asking what else is missing. > >> > >> There is certainly a lot of room for improvement here but I don't > >> understand your persistent negativity about what's been done thus far. > >> I think it's pretty clearly a huge step forward, and I think Amit > >> deserves a ton of credit for making it happen. The improvements in > >> bulk loading performance alone are stupendous. You apparently have > >> the idea that somebody could have written an even larger patch that > >> solved even more problems at once, but this was already a really big > >> patch, and IMHO quite a good one. > > > > Please explain these personal comments against me. > > Several of your emails, including your first post to this thread, > seemed to me to be quite negative about the state of this feature. I > don't think that's warranted, though perhaps I am misreading your > tone, as I have been known to do. I also don't think that expressing > the opinion that the feature is better than you're giving it credit > for is a personal comment against you. Where exactly do you see a > personal comment against you in what I wrote? I have to admit my reaction was similar to Simon's, meaning that the lack of docs is a problem, and that the limitations are kind of a surprise, and I wonder what other surprises there are. I am thinking this is a result of small teams, often from the same company, working on a features in isolation and then making them public. It is often not clear what decisions were made and why. The idea that unique indexes on a parent table can't guarantee uniqueness across child tables is both a surprise, and obvious once stated. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On 2017/02/16 10:45, Amit Langote wrote: > Also attaching 0002 (unchanged) for tab-completion support for the new > partitioning syntax. Robert already spawned a new thread titled "tab completion for partitioning" for this [0]. > 0003 changes how ExecFindPartition() shows the row for which > get_partition_for_tuple() failed to find a partition. As Simon commented > upthread, we should show just the partition key, not the whole row in the > error DETAIL. So the DETAIL now looks like what's shown by > _bt_check_unique() upon uniqueness violation: > > DETAIL: Partition key of the failing row contains (key1, key2, ...)=(val1, > val2, ...) > > The rules about which columns to show or whether to show the DETAIL at all > are similar to those in BuildIndexValueDescription(): > > - if user has SELECT privilege on the whole table, simply go ahead > > - if user doesn't have SELECT privilege on the table, check that they > can see all the columns in the key (no point in showing partial key); > however abort on finding an expression for which we don't try finding > out privilege situation of whatever columns may be in the expression I posted this patch in a new thread titled "error detail when partition not found" [1]. Thanks, Amit [0] https://www.postgresql.org/message-id/CA+TgmobYOj=A8GesiEs_V2Wq46-_w0+7MOwPiNWC+iuzJ-uWjA@mail.gmail.com [1] https://www.postgresql.org/message-id/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd%40lab.ntt.co.jp
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Feb 20, 2017 at 02:37:44PM +0530, Robert Haas wrote: >> On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> > On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote: >> >>> It leaves me asking what else is missing. >> >> >> >> There is certainly a lot of room for improvement here but I don't >> >> understand your persistent negativity about what's been done thus far. >> >> I think it's pretty clearly a huge step forward, and I think Amit >> >> deserves a ton of credit for making it happen. The improvements in >> >> bulk loading performance alone are stupendous. You apparently have >> >> the idea that somebody could have written an even larger patch that >> >> solved even more problems at once, but this was already a really big >> >> patch, and IMHO quite a good one. >> > >> > Please explain these personal comments against me. >> >> Several of your emails, including your first post to this thread, >> seemed to me to be quite negative about the state of this feature. I >> don't think that's warranted, though perhaps I am misreading your >> tone, as I have been known to do. I also don't think that expressing >> the opinion that the feature is better than you're giving it credit >> for is a personal comment against you. Where exactly do you see a >> personal comment against you in what I wrote? > > I have to admit my reaction was similar to Simon's, meaning that the > lack of docs is a problem, and that the limitations are kind of a > surprise, and I wonder what other surprises there are. > > I am thinking this is a result of small teams, often from the same > company, working on a features in isolation and then making them public. I agree that this is result of small teams. The partitioning feature encompasses features like global indexes, which is large in themselves. Usually, in a company many teams would be working on different sub-features in the same release schedule. But that's not the case here. We have to add sub-features in multiple releases. That might be the reason behind some of the current limitations. > It is often not clear what decisions were made and why. Amit Langote submitted the patch sometime in August 2015, which certainly didn't look like a well-thought design, certainly not a product of 'long cooking' within his company. It was more experimental. (Obviously he had background of many earlier discussions on partitioning, that all happened on hackers.) Since then all the discussion is on hackers; all decisions were made during the discussion. While what you are saying may be true with other patches, I am not sure whether it's true with this work. > The idea that > unique indexes on a parent table can't guarantee uniqueness across child > tables is both a surprise, and obvious once stated. I think, that's a limitation till we implement global indexes. But nothing in the current implementation stops us from implementing it. In fact, I remember, a reply from Robert to another thread on partitioning started in parallel to Amit's thread had explained some of these design decisions. I am unable to find link to that exact reply though. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On 2017/02/21 12:10, Ashutosh Bapat wrote: > I think, that's a limitation till we implement global indexes. But > nothing in the current implementation stops us from implementing it. > In fact, I remember, a reply from Robert to another thread on > partitioning started in parallel to Amit's thread had explained some > of these design decisions. I am unable to find link to that exact > reply though. Are you perhaps thinking of the message titled "design for a partitioning feature (was: inheritance)" a few months back: https://www.postgresql.org/message-id/CA+TgmoZeV0nryvw9cNB81xdOJg4XtpJMKDif0zTo-GdLcOCgcQ@mail.gmail.com Thanks, Amit
On Tue, Feb 21, 2017 at 9:57 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/02/21 12:10, Ashutosh Bapat wrote: >> I think, that's a limitation till we implement global indexes. But >> nothing in the current implementation stops us from implementing it. >> In fact, I remember, a reply from Robert to another thread on >> partitioning started in parallel to Amit's thread had explained some >> of these design decisions. I am unable to find link to that exact >> reply though. > > Are you perhaps thinking of the message titled "design for a partitioning > feature (was: inheritance)" a few months back: > > https://www.postgresql.org/message-id/CA+TgmoZeV0nryvw9cNB81xdOJg4XtpJMKDif0zTo-GdLcOCgcQ@mail.gmail.com Thanks a lot, that's the one I was searching for. And now that I confirm it, there's NO reply to Robert's mail. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Hi, There is a very small typo that a comma is missing. Attached is the patch to fix it. On Wed, 15 Feb 2017 07:57:53 -0500 Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Feb 15, 2017 at 4:26 AM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: > > Noticed some typos in the documentation. Here's patch to correct > > those. Sorry, if it has been already taken care of. > > Thanks. That is indeed nonstandard capitalization. Committed. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Yugo Nagata <nagata@sraoss.co.jp> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote: > I have to admit my reaction was similar to Simon's, meaning that the > lack of docs is a problem, and that the limitations are kind of a > surprise, and I wonder what other surprises there are. Did you read my message upthread pointing out that the initial commit contained hundreds of lines of documentation? I agree that it would be bad if table partitioning got committed with no documentation, but that did not happen. > I am thinking this is a result of small teams, often from the same > company, working on a features in isolation and then making them public. > It is often not clear what decisions were made and why. That also did not happen, or at least certainly not with this patch. All of the discussion was public and on the mailing list. I never communicated with Amit Langote off-list about this work, except shortly before I committed his patches I added him on Skype and gave him a heads up that I was planning to do so real soon. At no time have the two of us worked for the same company. Also, the patch had 7 other reviewers credited in the commit messages spread across, I think, 4 different companies. I think the issue here might be that with this feature, as with some other features I've committed over the last few years, the email discussion got very long. On the one hand, that does make it hard for others to keep up, but not because anything is secret, only because reading hundreds of email messages takes a lot of time. However, the large number of emails on a public mailing list makes it absolutely clear that this wasn't developed in isolation and presented as a done deal. It was written and rewritten multiple times in response to feedback, not only from me but from other people who did take the time to keep up with the discussion. As Ashutosh Bapat and Amit Langote already pointed out, I even posted (on a separate thread with a clear subject line) some thoughts about the overall design of this feature, in response to concerns articulated on an unrelated thread by Tom and Alvaro. I did that in an attempt to give people a separate thread on which to discuss those issues - without having to dive into the main thread where things were being hashed out in detail - but it got no responses, either because the logic was unarguable or because nobody took the time to write back. I think there's certainly room to criticize cases where a feature is designed, developed, and committed almost entirely by people who work at a single company, or where a significant amount of discussion happens off-list, but it would be difficult to find a more clear-cut case of where that DIDN'T happen than this patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Feb 21, 2017 at 10:27 PM, Yugo Nagata <nagata@sraoss.co.jp> wrote: > There is a very small typo that a comma is missing. > Attached is the patch to fix it. Thank you. I have committed your patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Feb 21, 2017 at 6:14 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote: >> I have to admit my reaction was similar to Simon's, meaning that the >> lack of docs is a problem, and that the limitations are kind of a >> surprise, and I wonder what other surprises there are. > > Did you read my message upthread pointing out that the initial commit > contained hundreds of lines of documentation? I agree that it would > be bad if table partitioning got committed with no documentation, but > that did not happen. > >> I am thinking this is a result of small teams, often from the same >> company, working on a features in isolation and then making them public. >> It is often not clear what decisions were made and why. > > That also did not happen, or at least certainly not with this patch. > All of the discussion was public and on the mailing list. FWIW, I agree that some of what has been claimed about what contributors failed to do with this patch is exaggerated, and not in a way that I'd understand as hyperbole that drives home a deeper point. I'm not the slightest bit surprised at the limitations that this feature has, even if Bruce and Simon are. The documentation needs work, and perhaps the feature itself needs a small tweak here or there. Just not to a particularly notable degree, given the point we are in in the release cycle. -- Peter Geoghegan
On 22 February 2017 at 19:57, Peter Geoghegan <pg@bowt.ie> wrote: > FWIW, I agree that some of what has been claimed about what > contributors failed to do with this patch is exaggerated, and not in a > way that I'd understand as hyperbole that drives home a deeper point. What claims are you talking about? Which things have been exaggerated, and by whom? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 22 February 2017 at 02:14, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote: >> I have to admit my reaction was similar to Simon's, meaning that the >> lack of docs is a problem, and that the limitations are kind of a >> surprise, and I wonder what other surprises there are. > > Did you read my message upthread pointing out that the initial commit > contained hundreds of lines of documentation? I agree that it would > be bad if table partitioning got committed with no documentation, but > that did not happen. You seem a little defensive about some reasonable review comments. While its true that the patch had syntax documentation, there was no user design documentation which explained how it worked to allow objective review. Had I been able to provide input without reading every email message, I would have done so earlier. Amit, The features I consider very important in the first release are 1. Declarative syntax (we have this!) 2. Tuple routing on INSERT/COPY (we have this!) 3. O(1) partition elimination for simple = queries 4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on each partition, b) partition key 2 and 3 are intimately connected because they would both use the same in-memory data for bsearch, so the code should be almost identical. 4 is important for Foreign Keys and Logical Replication As missing items, features 3 and 4 seem achievable in this release, potentially in restricted form. I think we should probably avoid trying to UPDATE rows from one partition to another in this release, since that seems likely to be buggy and seems like would only be needed in relatively few cases. Let me know if I can help with these. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > What claims are you talking about? Which things have been exaggerated, > and by whom? * The specious argument that we should "just" have CREATE INDEX create equivalent indexes across partitions, to save all those people from writing all those scripts. The reality is that there are several ways that that would not comport with the existing design of things. Most obviously: where does that leave the design of global indexes that we eventually come up with? Now, maybe there is a good way to make this easier for users, worth doing sooner rather than later, but that will be subtle, and you should at least acknowledge that. * "It leaves me asking what else is missing"... "If we wanted them to act identically we wouldn't have any need for a new feature at all, so clearly that doesn't make sense as an argument." These remarks sound imperious to me. I think that this could be quite demoralizing to someone in Amit's position, and you ought to give some consideration to that. I think that several of your remarks on the patch are facile and/or needlessly ambiguous, which is what makes this lack of tact seem unfair to me. * "Good work so far, but there is still a very significant amount of work to do." There is always more work to do, so why say so? I think that the implication is that this isn't complete as a feature that goes into the next release, which I disagree with. There is nothing disappointing to me about this feature, and, as I said, I am unsurprised that it doesn't support certain things. -- Peter Geoghegan
On 02/23/2017 09:27 AM, Peter Geoghegan wrote: > On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > * "Good work so far, but there is still a very significant amount of > work to do." > > There is always more work to do, so why say so? I think that the > implication is that this isn't complete as a feature that goes into > the next release, which I disagree with. There is nothing > disappointing to me about this feature, and, as I said, I am > unsurprised that it doesn't support certain things. > I don't think we need to start going down the avenue of "you could be nicer". We can all be nicer and we all have our good and bad days. If we start worrying about egos to this degree, we will never get anything done. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own.
On 23 February 2017 at 17:27, Peter Geoghegan <pg@bowt.ie> wrote: > On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> What claims are you talking about? Which things have been exaggerated, >> and by whom? > > * The specious argument that we should "just" have CREATE INDEX create > equivalent indexes across partitions, to save all those people from > writing all those scripts. > > The reality is that there are several ways that that would not comport > with the existing design of things. Most obviously: where does that > leave the design of global indexes that we eventually come up with? > Now, maybe there is a good way to make this easier for users, worth > doing sooner rather than later, but that will be subtle, and you > should at least acknowledge that. My argument was that CREATE INDEX is expected to just work on tables at present, so should also just work on partitioned tables. Without that, the reality is people will need to write scripts. I don't see how that relates to the desire for multiple index options, since one of them would need to be the default and we could provide one in this release, one in the next etc.. > * "It leaves me asking what else is missing"... "If we wanted them to > act identically we wouldn't have any need for a new feature at all, so > clearly that doesn't make sense as an argument." > > These remarks sound imperious to me. I think that this could be quite > demoralizing to someone in Amit's position, and you ought to give some > consideration to that. I think that several of your remarks on the > patch are facile and/or needlessly ambiguous, which is what makes this > lack of tact seem unfair to me. The current design has assumed many things, leading me to question what else has been assumed. Challenging those assumptions is important and has been upheld. I agree my review comments could well be demoralizing, which is why I said "Good work so far". It takes a while to realise that review comments are given to patch authors with the intent to help improve the product, not as personal attacks. I thought you would know that by now. Imperious? No, definitely a Jedi. > * "Good work so far, but there is still a very significant amount of > work to do." > > There is always more work to do, so why say so? I think that the > implication is that this isn't complete as a feature that goes into > the next release, which I disagree with. I've seen many patches rejected because they do not contain the desired feature set, yet. ISTM my opinion on when that is reached is as valid as yours or anyone else's, so I'm unclear as to your issue. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 23, 2017 at 11:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > My argument was that CREATE INDEX is expected to just work on tables > at present, so should also just work on partitioned tables. Without > that, the reality is people will need to write scripts. Really? What about postgres_fdw? Even if that counter-example didn't exist, I'd still disagree. People may expect that CREATE INDEX should just work, but that expectation is not as general as you suggest. Otherwise, I doubt we'd be talking about it at all. > I don't see how that relates to the desire for multiple index options, > since one of them would need to be the default and we could provide > one in this release, one in the next etc.. You didn't say any of that until now. And besides, I think that global indexes make a lot more sense as a default. You seem to be saying that a simple CREATE INDEX could be interpreted as meaning one or the other of those two behaviors just as easily (global index vs. create an index on each partition). I don't think it's a good idea to try to meet any general "just works" expectation if what you actually get does not fit the intuition of almost all users. "Just don't show me an error" seems like a bad design goal, especially for a utility statement. > The current design has assumed many things, leading me to question > what else has been assumed. > > Challenging those assumptions is important and has been upheld. I agree. The irony is that in so doing, you yourself make your own assumptions, confusing everyone, and making it harder to act on your feedback. You did make some reasonable points, IMV. > I've seen many patches rejected because they do not contain the > desired feature set, yet. Obviously that general principle is not under discussion. My point, of course, was that it seems pretty clear to me that this is on the right side of that fence. -- Peter Geoghegan
On Thu, Feb 23, 2017 at 10:00 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > You seem a little defensive about some reasonable review comments. I am prone to that from time to time, and this may be an instance of it. > While its true that the patch had syntax documentation, there was no > user design documentation which explained how it worked to allow > objective review. Had I been able to provide input without reading > every email message, I would have done so earlier. But I don't agree that it was impossible for you to provide input earlier without reading every email message, nor do I agree that it is unreasonable to expect to people who want to provide input to read the relevant threads. > The features I consider very important in the first release are > 1. Declarative syntax (we have this!) > 2. Tuple routing on INSERT/COPY (we have this!) > 3. O(1) partition elimination for simple = queries > 4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on > each partition, b) partition key > > 2 and 3 are intimately connected because they would both use the same > in-memory data for bsearch, so the code should be almost identical. > > 4 is important for Foreign Keys and Logical Replication > > As missing items, features 3 and 4 seem achievable in this release, > potentially in restricted form. Simon, this is ridiculous. We're 4 or 5 days away from the start of the last CommitFest. We have time to fix bugs and improve documentation and maybe tweak things here and there, but 3 and 4 are significant development projects. There isn't time to do that stuff right now and get it right. You don't get to show up more than two months after the feature is committed and start complaining that it doesn't include all the things you want. Which things ought to be included in the initial patch was under active discussion about a year ago this time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2/23/17 8:36 PM, Robert Haas wrote: > We're 4 or 5 days away from the start of > the last CommitFest. We have time to fix bugs and improve > documentation and maybe tweak things here and there, but 3 and 4 are > significant development projects. There isn't time to do that stuff > right now and get it right. It might be possible to provide some temporary work-arounds for some of this, which would be nice. But I agree that there's definitely not enough time to implement *good* solutions to even just automatic index creation, let alone somehow handling uniqueness. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On Fri, Feb 24, 2017 at 8:06 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Simon, this is ridiculous. We're 4 or 5 days away from the start of > the last CommitFest. We have time to fix bugs and improve > documentation and maybe tweak things here and there, but 3 and 4 are > significant development projects. There isn't time to do that stuff > right now and get it right. You don't get to show up more than two > months after the feature is committed and start complaining that it > doesn't include all the things you want. Which things ought to be > included in the initial patch was under active discussion about a year > ago this time. I take that back. You can complain as much as you like; everybody has a right to complain. But it's not reasonable to expect Amit (or anyone else) to go fix the things you're complaining about in time for v10, or really ever. He doesn't have to write any more partitioning patches ever, and if he does decide to do so, he doesn't have to write the ones you or I or anyone other than his employer wants him to write (and he only has to listen to his employer if he doesn't want to get fired). Also, I'm very much against against any major tinkering with this feature in this release. We've got enough work to do stabilizing what's already been committed in this area, and the last things we need is a bunch of patches that significant change it showing up at the eleventh hour without time for adequate reflection and discussion. Most if not all significant patches for this release should already have been submitted; again, the last CommitFest will be starting shortly, and we should have seen those patches in the previous CommitFest. We should be focusing on getting all the good patches that have already been written committed, not creating new ones at the last minute. Contrary to what you may think, neither changing the way partition pruning works nor inventing a system for indexes to roll down to partition children is a minor fix. Even if you restrict the scope to simple cases, there's still got to be a level of design agreement so that we know we're not boxing ourselves into a corner for the future, and the patch quality still has to be good. That's not going to happen in the next couple of days, barring a dramatic reversal of how the development process in this community has always worked before. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 24 February 2017 at 02:36, Robert Haas <robertmhaas@gmail.com> wrote: >> While its true that the patch had syntax documentation, there was no >> user design documentation which explained how it worked to allow >> objective review. Had I been able to provide input without reading >> every email message, I would have done so earlier. > > But I don't agree that it was impossible for you to provide input > earlier without reading every email message, nor do I agree that it is > unreasonable to expect to people who want to provide input to read the > relevant threads. >> The features I consider very important in the first release are >> 1. Declarative syntax (we have this!) >> 2. Tuple routing on INSERT/COPY (we have this!) >> 3. O(1) partition elimination for simple = queries >> 4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on >> each partition, b) partition key >> >> 2 and 3 are intimately connected because they would both use the same >> in-memory data for bsearch, so the code should be almost identical. >> >> 4 is important for Foreign Keys and Logical Replication >> >> As missing items, features 3 and 4 seem achievable in this release, >> potentially in restricted form. > > Simon, this is ridiculous. We're 4 or 5 days away from the start of > the last CommitFest. We have time to fix bugs and improve > documentation and maybe tweak things here and there, but 3 and 4 are > significant development projects. There isn't time to do that stuff > right now and get it right. Agreed, you beat me to it. I've spent the last few hours trying to see what I could personally pull out of the fire to assist. Those things aren't that big, but they are bigger than we have time for now. Had I known about that a few months back... well, we are where we are. The good news is that logical replication DOES work with partitioning, but only for a Publication with PUBLISH INSERT, pushing from a normal table to a partitioned one. Which is useful enough for first release. The work on having UPDATE work between partitions can be used to make updates and deletes work in logical replication. That might yet be made to work in this release, and I think we should look into it, but I think it can be left til next release if we try. > You don't get to show up more than two > months after the feature is committed and start complaining that it > doesn't include all the things you want. Which things ought to be > included in the initial patch was under active discussion about a year > ago this time. I've provided lots of input across many years, for example my explanation of how to do tuple routing is very clearly the basis of the design of the current feature. You *knew* I would have feedback and if it it hadn't appeared yet you knew it would come. The "two months after the feature is committed" is a direct result of the lack of docs. Note that within hours of reading the docs I'd given feedback. How could I give feedback earlier? Well, I tried. I've flown to Japan to ensure I could talk to Amit in person about this feature. Your absence at two consecutive developer meetings hasn't helped there. It is you that needs to show up more. I'm sure we both have family and work reasons not to attend them. The need for design feedback is exactly why the docs for logical replication were published in June, six months before logical replication was committed. With repect, you are making a few mistakes. The first is to imagine that review comments are negative or complaints; with the right viewpoint they could easily be seen as helping people to see what has been missed, yet you repeatedly see them as personal attacks and throw words back. Oh sure, I've done that myself in earlier days. The second is to imagine that discussing things on Hackers in multiple threads, spanning many months with long, detailed emails and rapid replies is something that anybody could have followed if they wished. And the third is to imagine I have no right to review; I will watch and see if you deploy this same "You don't get to show up.." argument against Tom or Noah when they point out holes in late reviews, though we already both know you won't. I see you using that yourself, objecting frequently against patches large and small if they do not meet your exacting standards, yet you have spoken multiple times against my right to do that. Perhaps we'll have time to scowl at each other in India. I'll look forward to that. ;-) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > The good news is that logical replication DOES work with partitioning, > but only for a Publication with PUBLISH INSERT, pushing from a normal > table to a partitioned one. Which is useful enough for first release. > > The work on having UPDATE work between partitions can be used to make > updates and deletes work in logical replication. That might yet be > made to work in this release, and I think we should look into it, but > I think it can be left til next release if we try. Are you speaking of the case where you want to replicate from an unpartitioned table to a partitioned table? I would imagine that if you are replicating between two identical partitioning hierarchies, it would just work. (If not, that's probably a bug, though I don't know whose bug.) >> You don't get to show up more than two >> months after the feature is committed and start complaining that it >> doesn't include all the things you want. Which things ought to be >> included in the initial patch was under active discussion about a year >> ago this time. > > I've provided lots of input across many years, for example my > explanation of how to do tuple routing is very clearly the basis of > the design of the current feature. You *knew* I would have feedback > and if it it hadn't appeared yet you knew it would come. Not really. I can't always predict who will take an interest in a patch. I'm not surprised that you have feedback on it, and if that feedback were phrased as "let's try to do X, Y, and Z in future releases" I'd have no problem with it. What I'm reacting against is really the idea that any of this should be done in v10 (and also the idea, which may be an overly defensive reading of your emails, that my original commit was somehow not up to the mark). > The "two months after the feature is committed" is a direct result of > the lack of docs. Note that within hours of reading the docs I'd given > feedback. How could I give feedback earlier? Well, I tried. Quite a few other people managed to give feedback earlier, so it evidently wasn't totally impossible. > I've flown > to Japan to ensure I could talk to Amit in person about this feature. > Your absence at two consecutive developer meetings hasn't helped > there. It is you that needs to show up more. I'm sure we both have > family and work reasons not to attend them. We used to have one developer meeting a year and I have attended it every year I was invited. Then it went up to two per year and I kept attending one of them per year. Now it seems to be three per year and I plan to keep attending one of them per year, unless one of the others happens to be scheduled together with an event I'm planning to attend anyway. As you say, for both family and work reasons, it's hard to commit to doing multiple such events per year. > The need for design feedback is exactly why the docs for logical > replication were published in June, six months before logical > replication was committed. Sure, I think that's great, but there's never been an obligation to write the documentation before the code in the PostgreSQL community. It's not a bad idea and I'm happy if people do it, but I'm not going to start refusing to commit the patches of people who don't do it. Probably 25% of patches have no documentation at all in the first version and that gets added later once some consensus is reached and the feature gets closer to commit; I think that's just fine. For larger features, the documentation often gets expanded after the initial commit; I think that's also fine. Unlike you, I actually don't think it's very hard to follow the discussion about the design of these features in most cases, and it speeds up development if every change in the proposed design doesn't require an adjustment to both the documentation and the code. I think the way logical replication was done was reasonable, and I think the way that partitioning was done was also reasonable. > With repect, you are making a few mistakes. The first is to imagine > that review comments are negative or complaints; with the right > viewpoint they could easily be seen as helping people to see what has > been missed, yet you repeatedly see them as personal attacks and throw > words back. Oh sure, I've done that myself in earlier days. Sure. > The second > is to imagine that discussing things on Hackers in multiple threads, > spanning many months with long, detailed emails and rapid replies is > something that anybody could have followed if they wished. I don't really see how that's a mistake. It might take more time than someone's willing to put in -- I have that problem too, on some threads -- but if someone has the time and is willing to spend it, then they can follow that discussion. > And the > third is to imagine I have no right to review; I will watch and see if > you deploy this same "You don't get to show up.." argument against Tom > or Noah when they point out holes in late reviews, though we already > both know you won't. I see you using that yourself, objecting > frequently against patches large and small if they do not meet your > exacting standards, yet you have spoken multiple times against my > right to do that. I don't think this primarily is about how I react to you vs. how I react to other people, although anybody will have noticed by now that you and I don't agree as often as I agree with some other people, and perhaps I'm being more negative than is deserved. That having been said, Tom and Noah typically complain about bugs in what got committed rather than complaining that the scope of the project was all wrong. I have heard them complain that the scope of the project is all wrong, but before things get committed, not after. What I hear you doing is saying that there are features missing that ought to have been there in the original commit or at least in the same release, and I wouldn't agree with that argument no matter who made it. Nice to have? Yes. Required before 10 goes out the door? Nope. It's not like the scope of this project wasn't extensively discussed long before anything to committed, and the only things we can reasonably do at this release cycle are ship it more or less as-is, with some small fixes here and there, or rip the whole thing out and try again later. I'm pretty convicted that the first option is better, but obviously people can disagree about that. What we can't do is insist on a whole bunch of additional development in the time remaining before feature freeze; the only way that can happen is if we move the whole release timetable out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 22, 2017 at 07:44:41AM +0530, Robert Haas wrote: > On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote: > > I have to admit my reaction was similar to Simon's, meaning that the > > lack of docs is a problem, and that the limitations are kind of a > > surprise, and I wonder what other surprises there are. > > Did you read my message upthread pointing out that the initial commit > contained hundreds of lines of documentation? I agree that it would > be bad if table partitioning got committed with no documentation, but > that did not happen. > > > I am thinking this is a result of small teams, often from the same > > company, working on a features in isolation and then making them public. > > It is often not clear what decisions were made and why. > > That also did not happen, or at least certainly not with this patch. > All of the discussion was public and on the mailing list. I never > communicated with Amit Langote off-list about this work, except > shortly before I committed his patches I added him on Skype and gave > him a heads up that I was planning to do so real soon. At no time > have the two of us worked for the same company. Also, the patch had 7 > other reviewers credited in the commit messages spread across, I > think, 4 different companies. I think you are right. I was only guessing on a possible cause of Simon's reaction since I had the same reaction. When traveling, it is hard to get excited about reading a 100+ post thread that has reached a conclusion. I found Simon's summary of the 4 sub-features to be helpful. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Sun, Feb 26, 2017 at 4:31 AM, Bruce Momjian <bruce@momjian.us> wrote: > I think you are right. I was only guessing on a possible cause of > Simon's reaction since I had the same reaction. When traveling, it is > hard to get excited about reading a 100+ post thread that has reached a > conclusion. I found Simon's summary of the 4 sub-features to be > helpful. OK, no problem. Basically, I think it's a bad plan to redesign this - or add any large amount of incremental change to what's already been done - at this point in the release cycle. Unless we're prepared to rip it all back out, we've got to ship more or less what we have and improve it later. I always viewed the mission of this patch as to set the stage for future improvements in this area, not to solve all of the problems by itself. I'm sorry if anyone was under a contrary impression, and I'm also sorry that the discussion seems to have left some people behind, but I did try my best. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 24/02/17 07:15, Robert Haas wrote: > On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> The good news is that logical replication DOES work with partitioning, >> but only for a Publication with PUBLISH INSERT, pushing from a normal >> table to a partitioned one. Which is useful enough for first release. >> >> The work on having UPDATE work between partitions can be used to make >> updates and deletes work in logical replication. That might yet be >> made to work in this release, and I think we should look into it, but >> I think it can be left til next release if we try. > > Are you speaking of the case where you want to replicate from an > unpartitioned table to a partitioned table? I would imagine that if > you are replicating between two identical partitioning hierarchies, it > would just work. (If not, that's probably a bug, though I don't know > whose bug.) > Yes same hierarchies will work but mainly because one has to add partitions themselves to publication currently. I guess that's the limitation we might have to live with in 10 as adding the whole partitioned table should probably work for different hierarchies when we enable it and I am not quite sure that's doable before start of the CF at this point. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017/02/27 3:18, Petr Jelinek wrote: > On 24/02/17 07:15, Robert Haas wrote: >> On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> The good news is that logical replication DOES work with partitioning, >>> but only for a Publication with PUBLISH INSERT, pushing from a normal >>> table to a partitioned one. Which is useful enough for first release. >>> >>> The work on having UPDATE work between partitions can be used to make >>> updates and deletes work in logical replication. That might yet be >>> made to work in this release, and I think we should look into it, but >>> I think it can be left til next release if we try. >> >> Are you speaking of the case where you want to replicate from an >> unpartitioned table to a partitioned table? I would imagine that if >> you are replicating between two identical partitioning hierarchies, it >> would just work. (If not, that's probably a bug, though I don't know >> whose bug.) >> > > Yes same hierarchies will work but mainly because one has to add > partitions themselves to publication currently. I guess that's the > limitation we might have to live with in 10 as adding the whole > partitioned table should probably work for different hierarchies when we > enable it and I am not quite sure that's doable before start of the CF > at this point. If and when we add support to add partitioned tables to publications, I think it will work by recursing to include the partitions to the same publication (I see that OpenTableList() in publicationcmds.c calls find_all_inheritors if recursion is requested by not specifying ONLY). When a subscription replicates from this publication, it's going to match changes for individual leaf partitions, not the root parent table. IOW, none of the changes applied to a partitioned table are replicated as changes to the table itself. So, it seems that adding a partitioned table to a publication or subscription would simply be a *shorthand* for adding all the (leaf) partitions that will actually emit and receive changes. I'm not sure but should adding/removing partitions after the fact cause their addition/removal from the publication (& subscription)? Maybe we'll discuss these issues later. By the way, we currently get the following error due to the relkind check in check_publication_add_relation(): create publication p_pub for table p; ERROR: "p" is not a table DETAIL: Only tables can be added to publications. Would it be better to produce an error message that explicitly states that adding partitioned tables to publications is not supported. Something like: create publication p_pub for table p; ERROR: table "p" cannot be replicated DETAIL: Adding partitioned tables to publications is not supported. Thanks, Amit
On 2017/02/15 12:00, Robert Haas wrote: > On Fri, Feb 10, 2017 at 3:00 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > >> Without claiming I'm happy about this, I think the best way to improve >> the number of eyeballs on this is to commit these docs as is. >> >> For me, the most important thing is understanding the feature, not >> (yet) discussing what the docs should look like. This is especially >> true if other patches reference the way partitioning works and nobody >> can comment on those patches because they don't understand >> >> Any issues with that? > > There are a number of things that I think are awkward about the patch > as committed: > > + <listitem> > + <para> > + See last section for some general information: > + <xref linkend="ddl-partitioned-tables"> > + </para> > + </listitem> > > I think we generally try to write the documentation in such a way as > to minimize backward references, and a backward reference to the > previous section seems particularly odd. We've now got section "5.10 > Partitioned Tables" followed immediately by section "5.11 > Partitioning", where the latter seems to think that you haven't read > the former. > > I think that section 5.11 needs a much heavier rewrite than what it > got as part of this patch. It's a hodgepodge of the old content > (which explained how to fake partitioning when we didn't have an > explicit concept of partitioning) and new content that talks about how > the new way is different from the old way. But now that we have the > new way, I'm guessing that most people are going to use that and not > care about the old way any more. I'm not that it's even appropriate > to keep the lengthy explanation of how to fake table partitioning > using table inheritance and non-overlapping CHECK constraints, but if > we still want that stuff it should be de-emphasized more than it is > here. Probably the section should be retitled: in previous releases > we called this "Partitioning" because we had no explicit notion of > partitioning, but now that we do, it's confusing to have a section > called "Partitioning" that explains how to avoid using the > partitioning feature, which is more or less what this does. Or maybe > the section title should stay the same (or this should be merged into > the previous section?) but rewritten to change the emphasis. I agree that my patch failed to de-emphasize the old partitioning method enough. The examples in 5.11 Partitioning chapter also did not highlight the new partitioning feature as much as it should have been, so it indeed reads like a description of how to avoid using the new partitioning feature. Should we completely remove details about the older partitioning methods? I like the idea of merging what are now two chapters into one and call it Partitioned Tables, retaining the text that describes concepts while getting rid of the texts detailing inheritance implementation. Perhaps with the following sub-sections: 5.10 Partitioned Tables 5.10.1 When To Use Partitioning? (what is now 5.11.1 Overview) 5.10.2 Example (what is now 5.11.2. Implementing Partitioning) 5.10.3 Managing Partitions (same title as 5.11.3) 5.10.4. Partitioning and Constraint Exclusion (same title as 5.11.4) About this, do we want to emphasize the fact that the new partitioned tables *currently* depend on constraint exclusion? I guess not. So the sub-section is retained most as is, with some tweaks. 5.10.5 Caveats (that still are) As mentioned above, the above sub-sections will retain the old text that talks about concepts and not the particular implementation details. For example, in 5.10.3 Managing Partitions, the following text will be retained: <sect2 id="ddl-partitioning-managing-partitions"> <title>Managing Partitions</title> <para> Normally the set of partitions established when initially defining the table are not intended to remain static.It is common to want to remove old partitions of data and periodically add new partitions for new data. Oneof the most important advantages of partitioning is precisely that it allows this otherwise painful task to be executednearly instantaneously by manipulating the partition structure, rather than physically moving large amountsof data around. </para> I will go make a patch for this if there are no objections. Suggestions are welcome. Thanks, Amit
On 27 February 2017 at 10:12, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > I agree that my patch failed to de-emphasize the old partitioning method > enough. The examples in 5.11 Partitioning chapter also did not highlight > the new partitioning feature as much as it should have been, so it indeed > reads like a description of how to avoid using the new partitioning > feature. Your patch was incredibly useful; I just wanted it earlier. I think we're probably all agreed that we should highlight benefits of the new approach more, though the list of caveats should stay somewhere, just as we did for the original inheritance feature, and other things such as replication. > Should we completely remove details about the older partitioning > methods? No, because there is much code out there using it for last 12 years that needs to be explained and there are still some use cases where it is useful that aren't on the roadmap for partitioning. > I like the idea of merging what are now two chapters into one and call it > Partitioned Tables, retaining the text that describes concepts +1 ...but how? 5.10 Partitioned Tables and Related Solutions 5.10.1 Declarative Partitioning (this new feature) 5.10.2 Managing Partitions using Inheritance 5.10.3 Managing Partitions using Union All Views 5.10.4 Accessing tables using BRIN indexes So first and foremost we highlight the new feature and explain all its strengths with examples. We then explain the other possible ways of implementing something similar. This allows us to explain how to handle cases such as when partitions have different set of columns etc.. I'm happy to put my name down to write the sections on Union All Views, which is useful but only mentioned in passing, and the section on BRIN indexes, all of which would have their own independent sets of caveats. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 27/02/17 07:59, Amit Langote wrote: > On 2017/02/27 3:18, Petr Jelinek wrote: >> On 24/02/17 07:15, Robert Haas wrote: >>> On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> The good news is that logical replication DOES work with partitioning, >>>> but only for a Publication with PUBLISH INSERT, pushing from a normal >>>> table to a partitioned one. Which is useful enough for first release. >>>> >>>> The work on having UPDATE work between partitions can be used to make >>>> updates and deletes work in logical replication. That might yet be >>>> made to work in this release, and I think we should look into it, but >>>> I think it can be left til next release if we try. >>> >>> Are you speaking of the case where you want to replicate from an >>> unpartitioned table to a partitioned table? I would imagine that if >>> you are replicating between two identical partitioning hierarchies, it >>> would just work. (If not, that's probably a bug, though I don't know >>> whose bug.) >>> >> >> Yes same hierarchies will work but mainly because one has to add >> partitions themselves to publication currently. I guess that's the >> limitation we might have to live with in 10 as adding the whole >> partitioned table should probably work for different hierarchies when we >> enable it and I am not quite sure that's doable before start of the CF >> at this point. > > If and when we add support to add partitioned tables to publications, I > think it will work by recursing to include the partitions to the same > publication (I see that OpenTableList() in publicationcmds.c calls > find_all_inheritors if recursion is requested by not specifying ONLY). > When a subscription replicates from this publication, it's going to match > changes for individual leaf partitions, not the root parent table. IOW, > none of the changes applied to a partitioned table are replicated as > changes to the table itself. So, it seems that adding a partitioned table > to a publication or subscription would simply be a *shorthand* for adding > all the (leaf) partitions that will actually emit and receive changes. This was my first thought as well. However we need to also take into account the use-case with different partitioning topology on publisher and subscriber (at least in a sense that the initial design will not paint us in the corner). Now I see two ways of achieving this. a) Either going with more or less what you wrote above and in the future have some smarts where we can specify that it should replicate with different name. We'll eventually want to be able to replicate table to differently named table anyway so it's not stretch by any means to do that. b) Or just replicate changes to leaf partitions as changes to partitioned table. That's also not that hard to do, we just need fast lookup of partitioned table from leaf table. I guess a) looks simpler given that we eventually need the rename anyway, but I'd like opinions of other people as well. > I'm not sure but should adding/removing partitions after the fact cause > their addition/removal from the publication (& subscription)? Maybe we'll > discuss these issues later. That's something I've been also wondering as there is many corner cases here. For example if table is in some publication and then is added as partition to partitioned table what should happen? What should happen when the partitioned table is then removed from same publication to which partition was added explicitly? Should we allow different publication configuration for different partitions within same partitioned table, etc? This somewhat brings bigger question about where we want to go in general with partitioning. And that is, should partition be a separate entity that is on the same level of table, or should it be part of the partitioned table without it's own "identity". -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017/02/27 20:44, Simon Riggs wrote: > On 27 February 2017 at 10:12, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > >> I agree that my patch failed to de-emphasize the old partitioning method >> enough. The examples in 5.11 Partitioning chapter also did not highlight >> the new partitioning feature as much as it should have been, so it indeed >> reads like a description of how to avoid using the new partitioning >> feature. > > Your patch was incredibly useful; I just wanted it earlier. Thanks. Anyway, we agree that there is room for improvement. > I think we're probably all agreed that we should highlight benefits of > the new approach more, though the list of caveats should stay > somewhere, just as we did for the original inheritance feature, and > other things such as replication. Yes. >> Should we completely remove details about the older partitioning >> methods? > > No, because there is much code out there using it for last 12 years > that needs to be explained and there are still some use cases where it > is useful that aren't on the roadmap for partitioning. Sure, it's a matter of where we place it in the rewritten section about partitioning. >> I like the idea of merging what are now two chapters into one and call it >> Partitioned Tables, retaining the text that describes concepts > > +1 > > ...but how? > > 5.10 Partitioned Tables and Related Solutions Presumably, this is where we put the "why" of using partitioning as part of database design, that is, some of the text at the beginning of the 5.11 Partitioning section. > 5.10.1 Declarative Partitioning (this new feature) > 5.10.2 Managing Partitions using Inheritance > 5.10.3 Managing Partitions using Union All Views > 5.10.4 Accessing tables using BRIN indexes Each of these sub-sections explain the "how", using that method. We point out here when it's better to use one method over another, limitations when using a particular method, etc. > So first and foremost we highlight the new feature and explain all its > strengths with examples. > > We then explain the other possible ways of implementing something > similar. This allows us to explain how to handle cases such as when > partitions have different set of columns etc.. Yeah, we don't allow that with declarative partitioned tables. A user reading this section should be able to choose the best solution for their needs. > I'm happy to put my name down to write the sections on Union All > Views, which is useful but only mentioned in passing, and the section > on BRIN indexes, all of which would have their own independent sets of > caveats. OK. So, I will start writing the patch with above general skeleton and hopefully post it within this week and you can improve it as fit. Thanks, Amit
On 28 February 2017 at 08:14, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > OK. So, I will start writing the patch with above general skeleton and > hopefully post it within this week and you can improve it as fit. Will do, thanks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Feb 27, 2017 at 5:14 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I like the idea of merging what are now two chapters into one and call it >> Partitioned Tables, retaining the text that describes concepts > > +1 > > ...but how? > > 5.10 Partitioned Tables and Related Solutions > 5.10.1 Declarative Partitioning (this new feature) > 5.10.2 Managing Partitions using Inheritance > 5.10.3 Managing Partitions using Union All Views > 5.10.4 Accessing tables using BRIN indexes > > So first and foremost we highlight the new feature and explain all its > strengths with examples. > > We then explain the other possible ways of implementing something > similar. This allows us to explain how to handle cases such as when > partitions have different set of columns etc.. > > I'm happy to put my name down to write the sections on Union All > Views, which is useful but only mentioned in passing, and the section > on BRIN indexes, all of which would have their own independent sets of > caveats. I like the proposed 5.10.1 and 5.10.2 organization. I am not sure whether 5.10.3 and 5.10.4 make sense because I can't quite imagine what content would go there. We don't document UNION ALL views as a method today, and I'm not sure we really need to start. Also I don't see what BRIN indexes have to do with partitioning. But I may be missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/02/28 17:25, Simon Riggs wrote: > On 28 February 2017 at 08:14, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: > >> OK. So, I will start writing the patch with above general skeleton and >> hopefully post it within this week and you can improve it as fit. > > Will do, thanks. Attached patch 0001 is what I managed so far. Please take a look and let me know if there is more I can do. I guess you might want to expand the parts related to UNION ALL views and BRIN indexes. Also for consideration, 0002: some cosmetic fixes to create_table.sgml 0003: add clarification about NOT NULL constraint on partition columns in alter_table.sgml Thoughts? Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
I think you might have the titles for 0002 and 0003 backwards. On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > 0002: some cosmetic fixes to create_table.sgml I think this sentence may be unclear to some readers: + One might however want to set it for only some partitions, + which is possible by doing <literal>SET NOT NULL</literal> on individual + partitions. I think you could replace this with something like: Even if there is no <literal>NOT NULL</> constraint on the parent, such a constraint can still be added to individual partitions, if desired; that is, the children can disallow nulls even if the parent allows them, but not the other way around. > 0003: add clarification about NOT NULL constraint on partition columns in > alter_table.sgml This is about list-ifying a note, but I think we should try to de-note-ify it. It's a giant block of text that is not obviously more noteworthy than the surrounding text; I think <note> should be saved for things that particularly need to be called out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > This is about list-ifying a note, but I think we should try to > de-note-ify it. It's a giant block of text that is not obviously more > noteworthy than the surrounding text; I think <note> should be saved > for things that particularly need to be called out. Yeah. A big problem with the markup we use, imo, is that <note> segments are displayed in a way that makes them more prominent than the surrounding text, not less so. That doesn't really square with my intuitive view of what a <note> ought to be used for; it forces it to be considered as something only slightly less dangerous than a <caution> or <warning>, not as a parenthetical remark. But that's what we have to deal with so we should mark up our text accordingly. regards, tom lane
On 2017/03/10 3:26, Robert Haas wrote: > I think you might have the titles for 0002 and 0003 backwards. Oops, you're right. > On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote: >> 0002: some cosmetic fixes to create_table.sgml > > I think this sentence may be unclear to some readers: > > + One might however want to set it for only some partitions, > + which is possible by doing <literal>SET NOT NULL</literal> on individual > + partitions. > > I think you could replace this with something like: Even if there is > no <literal>NOT NULL</> constraint on the parent, such a constraint > can still be added to individual partitions, if desired; that is, the > children can disallow nulls even if the parent allows them, but not > the other way around. Reads much better, done that way. Thanks. >> 0003: add clarification about NOT NULL constraint on partition columns in >> alter_table.sgml > > This is about list-ifying a note, but I think we should try to > de-note-ify it. It's a giant block of text that is not obviously more > noteworthy than the surrounding text; I think <note> should be saved > for things that particularly need to be called out. OK. The patch is now just about de-note-ifying the block of text. Since I don't see any other lists in the Parameters portion of the page, I also take back my list-ifying proposal. Attached updated patches. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Thu, Mar 9, 2017 at 8:23 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/03/10 3:26, Robert Haas wrote: >> I think you might have the titles for 0002 and 0003 backwards. > > Oops, you're right. > >> On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote: >>> 0002: some cosmetic fixes to create_table.sgml >> >> I think this sentence may be unclear to some readers: >> >> + One might however want to set it for only some partitions, >> + which is possible by doing <literal>SET NOT NULL</literal> on individual >> + partitions. >> >> I think you could replace this with something like: Even if there is >> no <literal>NOT NULL</> constraint on the parent, such a constraint >> can still be added to individual partitions, if desired; that is, the >> children can disallow nulls even if the parent allows them, but not >> the other way around. > > Reads much better, done that way. Thanks. > >>> 0003: add clarification about NOT NULL constraint on partition columns in >>> alter_table.sgml >> >> This is about list-ifying a note, but I think we should try to >> de-note-ify it. It's a giant block of text that is not obviously more >> noteworthy than the surrounding text; I think <note> should be saved >> for things that particularly need to be called out. > > OK. The patch is now just about de-note-ifying the block of text. Since > I don't see any other lists in the Parameters portion of the page, I also > take back my list-ifying proposal. > > Attached updated patches. Committed 0002, 0003. I think the section on BRIN in 0001 is just silly. BRIN is a very useful index type, possibly more useful than anything except btree, but I think documenting it as an alternative method of partitioning is over the top. + The following forms of partitioning can be implemented in + <productname>PostgreSQL</productname>: Any form of partitioning can be implemented, at least to some degree, using inheritance or UNION ALL views. I think what this should say is that PostgreSQL has native support for list and range partitioning, and then it can go on top say that if this built-in support is not suitable for a particular use case (either because you need some other partitioning scheme or due to some other limitation), inheritance or UNION ALL views can be used instead, adding flexibility but losing some of the performance benefits of built-in declarative partitioning. <para> Partitions may have their own indexes, constraints and default values, - distinct from other partitions. Partitions do not inherit indexes from - the partitioned table. + distinct from other partitions. Partitions do not currently inherit + indexes from the partitioned table. + </para> + + <para> + See <xref linkend="sql-createtable"> for more details creating partitioned + tables and partitions. </para> I don't think we should add "currently"; that amounts to speculation about what will happen in future versions. Also, I favor collapsing these into one paragraph. A single-sentence paragraph tends to look OK when you're reading the SGML directly, but it looks funny in the rendered version. + <firstterm>sub-partitioning</firstterm>. It is not currently possible to + alter a regular table into a partitioned table or vice versa. However, + it is possible to add a regular or partitioned table containing data into + a partition of a partitioned table, or remove a partition; see I think we should say "as a partition" rather than "into a partition", assuming you're talking about ATTACH PARTITION here. - partitioned tables. For example, specifying <literal>ONLY</literal> - when querying data from a partitioned table would not make much sense, - because all the data is contained in partitions, so this raises an - error. Specifying <literal>ONLY</literal> when modifying schema is - not desirable in certain cases with partitioned tables where it may be - fine for regular inheritance parents (for example, dropping a column - from only the parent); an error will be thrown in that case. + partitioned tables. Specifying <literal>ONLY</literal> when modifying + schema is not desirable in certain cases with partitioned tables + whereas it may be fine for regular inheritance parents (for example, + dropping a column from only the parent); an error will be thrown in + that case. I don't see why this is an improvement. - data inserted into the partitioned table cannot be routed to foreign table - partitions. + data inserted into the partitioned table is currently not routed to foreign + table partitions. Again, let's not speculate about the future. + Note that it is currently not supported to propagate index definition + from the master partitioned table to its partitions; in fact, it is + not possible to define indexes on partitioned tables in the first + place. This might change in future releases. Same here. + There are currently the following limitations of using partitioned tables: And here. Better to write "The following limitations apply to partitioned tables:" + It is currently not possible to define indexes on partitioned tables + that include all rows from all partitions in one global index. + Consequently, it is not possible to create constraints that are realized + using an index such as <literal>UNIQUE</>. This doesn't seem very grammatical, and it kind of overlaps with the previous point, and the following point. How about just adding a sentence to the previous paragraph: This also means that there is no way to create a primary key, unique constraint, or exclusion constraint spanning all partitions; it is only possible to constrain each leaf partition individually. + <command>INSERT</command> statements with <literal>ON CONFLICT</> + clause are currently not allowed on partitioned tables. Obsolete. + implicit partition constraint of the original partition. This might + change in future releases. Remove speculation. + In some cases, one may want to add columns to partitions that are not + present in the parent table which is not possible to do with the above + method. For such cases, partitioning can be implemented using + inheritance (see <xref linkend="ddl-inherit">). Hmm, I bet that's not the only advantage. And it doesn't seem like the way to lead. e.g. While the built-in declarative partitioning is suitable for most common use cases, there are some circumstances where a more flexible approach may be useful. Partitioning can be implemented using table inheritance, which allows for several features which are not supported by declarative partitioning, such as: - Partitioning enforces a rule that all partitions must have exactly the same set of columns as the parent, but table inheritance allows children to have extra columns not present in the parent. - Table inheritance allows for multiple inheritance. - Declarative partitioning only supports list and range partitioning, whereas table inheritance allows data to be divided in a manner of the user's choosing. (Note, however, that if constraint exclusion is unable to prune partitions effectively, query performance will be very poor.) - Some operations require a stronger lock when using declarative partitioning than when using table inheritance. (list these) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/03/28 0:23, Robert Haas wrote: > On Thu, Mar 9, 2017 at 8:23 PM, Amit Langote wrote: >> Attached updated patches. > > Committed 0002, 0003. Thanks a lot for committing these and reviewing 0001. > I think the section on BRIN in 0001 is just silly. BRIN is a very > useful index type, possibly more useful than anything except btree, > but I think documenting it as an alternative method of partitioning is > over the top. Okay, removing the BRIN part from this patch for now. > + The following forms of partitioning can be implemented in > + <productname>PostgreSQL</productname>: > > Any form of partitioning can be implemented, at least to some degree, > using inheritance or UNION ALL views. I think what this should say is > that PostgreSQL has native support for list and range partitioning, > and then it can go on top say that if this built-in support is not > suitable for a particular use case (either because you need some other > partitioning scheme or due to some other limitation), inheritance or > UNION ALL views can be used instead, adding flexibility but losing > some of the performance benefits of built-in declarative partitioning. You're right. I've updated the text to sound like what you said here. > <para> > Partitions may have their own indexes, constraints and default values, > - distinct from other partitions. Partitions do not inherit indexes from > - the partitioned table. > + distinct from other partitions. Partitions do not currently inherit > + indexes from the partitioned table. > + </para> > + > + <para> > + See <xref linkend="sql-createtable"> for more details creating partitioned > + tables and partitions. > </para> > > I don't think we should add "currently"; that amounts to speculation > about what will happen in future versions. Also, I favor collapsing > these into one paragraph. A single-sentence paragraph tends to look > OK when you're reading the SGML directly, but it looks funny in the > rendered version. Done. > > + <firstterm>sub-partitioning</firstterm>. It is not currently possible to > + alter a regular table into a partitioned table or vice versa. However, > + it is possible to add a regular or partitioned table containing data into > + a partition of a partitioned table, or remove a partition; see > > I think we should say "as a partition" rather than "into a partition", > assuming you're talking about ATTACH PARTITION here. Right, fixed. > > - partitioned tables. For example, specifying <literal>ONLY</literal> > - when querying data from a partitioned table would not make much sense, > - because all the data is contained in partitions, so this raises an > - error. Specifying <literal>ONLY</literal> when modifying schema is > - not desirable in certain cases with partitioned tables where it may be > - fine for regular inheritance parents (for example, dropping a column > - from only the parent); an error will be thrown in that case. > + partitioned tables. Specifying <literal>ONLY</literal> when modifying > + schema is not desirable in certain cases with partitioned tables > + whereas it may be fine for regular inheritance parents (for example, > + dropping a column from only the parent); an error will be thrown in > + that case. > > I don't see why this is an improvement. Because we neither raise an error nor ignore it if ONLY is specified when querying data from a partitioned table. create table p (a int, b char) partition by list (a); create table p1 partition of p for values in (1); insert into p values (1); select * from only p; a | b ---+--- (0 rows) explain select * from only p; QUERY PLAN ------------------------------------------- Result (cost=0.00..0.00 rows=0 width=12) One-Time Filter: false (2 rows) IOW, querying behavior is same as regular inheritance. I rewrote the paragraph as follows: <para> The <literal>ONLY</literal> notation used to exclude child tables will cause an error for partitioned tables in the case of schema-modifying commands such as most <literal>ALTER TABLE</literal> commands. For example, dropping a column from only the parent does not make sense for partitioned tables. </para> > - data inserted into the partitioned table cannot be routed to foreign table > - partitions. > + data inserted into the partitioned table is currently not routed to foreign > + table partitions. > > Again, let's not speculate about the future. > > + Note that it is currently not supported to propagate index definition > + from the master partitioned table to its partitions; in fact, it is > + not possible to define indexes on partitioned tables in the first > + place. This might change in future releases. > > Same here. > > + There are currently the following limitations of using partitioned tables: > > And here. Better to write "The following limitations apply to > partitioned tables:" Fixed all of these. > + It is currently not possible to define indexes on partitioned tables > + that include all rows from all partitions in one global index. > + Consequently, it is not possible to create constraints that are realized > + using an index such as <literal>UNIQUE</>. > > This doesn't seem very grammatical, and it kind of overlaps with the > previous point, and the following point. How about just adding a > sentence to the previous paragraph: This also means that there is no > way to create a primary key, unique constraint, or exclusion > constraint spanning all partitions; it is only possible to constrain > each leaf partition individually. OK, done. > + <command>INSERT</command> statements with <literal>ON CONFLICT</> > + clause are currently not allowed on partitioned tables. > > Obsolete. Text from the patch you just committed now replaces this item. > + implicit partition constraint of the original partition. This might > + change in future releases. > > Remove speculation. Done. Also, a few other "currently"s I had added. > + In some cases, one may want to add columns to partitions that are not > + present in the parent table which is not possible to do with the above > + method. For such cases, partitioning can be implemented using > + inheritance (see <xref linkend="ddl-inherit">). > > Hmm, I bet that's not the only advantage. And it doesn't seem like > the way to lead. > > e.g. > > While the built-in declarative partitioning is suitable for most > common use cases, there are some circumstances where a more flexible > approach may be useful. Partitioning can be implemented using table > inheritance, which allows for several features which are not supported > by declarative partitioning, such as: > > - Partitioning enforces a rule that all partitions must have exactly > the same set of columns as the parent, but table inheritance allows > children to have extra columns not present in the parent. > > - Table inheritance allows for multiple inheritance. > > - Declarative partitioning only supports list and range partitioning, > whereas table inheritance allows data to be divided in a manner of the > user's choosing. (Note, however, that if constraint exclusion is > unable to prune partitions effectively, query performance will be very > poor.) > > - Some operations require a stronger lock when using declarative > partitioning than when using table inheritance. (list these) Thanks, that's a lot better. Attached updated patch. Regards, Amit
On Tue, Mar 28, 2017 at 6:35 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Attached updated patch. Committed with editing here and there. I just left out the thing about UNION ALL views, which seemed to brief a treatment to deserve its own subsection. Maybe a longer explanation of that is worthwhile or maybe it's not, but that can be a separate patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017/04/01 6:37, Robert Haas wrote: > On Tue, Mar 28, 2017 at 6:35 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Attached updated patch. > > Committed with editing here and there. I just left out the thing > about UNION ALL views, which seemed to brief a treatment to deserve > its own subsection. Maybe a longer explanation of that is worthwhile > or maybe it's not, but that can be a separate patch. Thanks for committing. I noticed what looks like a redundant line (or an incomplete sentence) in the paragraph about multi-column partition keys. In the following text: + partitioning, if desired. Of course, this will often result in a larger + number of partitions, each of which is individually smaller. + criteria. Using fewer columns may lead to coarser-grained + A query accessing the partitioned table will have + to scan fewer partitions if the conditions involve some or all of these This: + criteria. Using fewer columns may lead to coarser-grained looks redundant. But maybe we can keep that sentence by completing it, which the attached patch does as follows: + number of partitions, each of which is individually smaller. On the + other hand, using fewer columns may lead to a coarser-grained + partitioning criteria with smaller number of partitions. The patch also fixes some typos I noticed. Thanks, Amit
Attachment
On Mon, Apr 3, 2017 at 12:52 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > I noticed what looks like a redundant line (or an incomplete sentence) in > the paragraph about multi-column partition keys. In the following text: > > + partitioning, if desired. Of course, this will often result in a > larger > + number of partitions, each of which is individually smaller. > + criteria. Using fewer columns may lead to coarser-grained > + A query accessing the partitioned table will have > + to scan fewer partitions if the conditions involve some or all of > these > > This: > > + criteria. Using fewer columns may lead to coarser-grained > > looks redundant. But maybe we can keep that sentence by completing it, > which the attached patch does as follows: > > + number of partitions, each of which is individually smaller. On the > + other hand, using fewer columns may lead to a coarser-grained > + partitioning criteria with smaller number of partitions. > > The patch also fixes some typos I noticed. Thanks for the corrections. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company