Thread: Restriction on table partition expressions
Hey PG community, PG version: 13 Platform: Linux I was wondering if anyone understands why there is the restriction/limitation on using expressions (on a primary/unique key) as part of a table partition definition. E.g. CREATE TABLE foobar( id BIGINT NOT NULL PRIMARY KEY, baz VARCHAR NULL DEFAULT NULL ) PARTITION BY HASH(my_func(id)); Error: primary key constraints cannot be used when partition keys include expressions. I have a case where using either hash or list partitioning schemes, it's handy to use my_func() on the chosen field (look up some accompanying value in another table, for example) but I cannot because it violates a limitation imposed by PG. Yet, I very much want 'id' to be my primary key! What's more, to retain referential integrity I want to keep 'id' as a primary key because in some other relation I define a foreign key by it. I couldn't find much, if anything, about using expressions in table partitions let alone describing the restriction. Can anyone enlighten me? Or point me to what I've missed! Also, is there a chance that this limitation will be relaxed in the future? Here's an example on dbfiddle; https://dbfiddle.uk/?rdbms=postgres_13&fiddle=cb498f4e3c6b06e1b61bb0b7e57747e6 Note the use of ON CONFLICT is key to our code and so attempting any dynamic creations of individual child partitions, indices etc. also fail to work properly since the unique id constraint isn't known to the parent table. Cheers Jim -- Jim Vanns Principal Production Engineer Industrial Light & Magic, London
On Fri, 26 Aug 2022 at 03:08, James Vanns <jvanns@ilm.com> wrote: > CREATE TABLE foobar( > id BIGINT NOT NULL PRIMARY KEY, > baz VARCHAR NULL DEFAULT NULL > ) PARTITION BY HASH(my_func(id)); > > Error: primary key constraints cannot be used when partition keys > include expressions. > I couldn't find much, if anything, about using expressions in table > partitions let alone describing the restriction. Can anyone enlighten > me? Or point me to what I've missed! Also, is there a chance that this > limitation will be relaxed in the future? This is mentioned in [1]: "Unique constraints (and hence primary keys) on partitioned tables must include all the partition key columns. This limitation exists because the individual indexes making up the constraint can only directly enforce uniqueness within their own partitions; therefore, the partition structure itself must guarantee that there are not duplicates in different partitions." Maybe that could be more clear and mention that all the primary key columns must be present and not be part of a function call or expression. David [1] https://www.postgresql.org/docs/13/ddl-partitioning.html
On Fri, 26 Aug 2022 at 03:08, James Vanns <jvanns@ilm.com> wrote: > Also, is there a chance that this > limitation will be relaxed in the future? (forgot to answer this part) Certainly not in the near future, I'm afraid. It would require allowing a single index to exist over multiple tables. There has been discussions about this in the past and the general thoughts are that if you have a single index over all partitions, then it massively detracts from the advantages of partitioning. With partitioning, you can DETACH or DROP a partition and get rid of all the data quickly in a single metadata operation. If you have an index over all partitions then that operation is no longer a metadata-only operation. It suddenly needs to go and remove or invalidate all records pointing to the partition you want to detach/drop. David
Thanks for that, David. It makes sense and no, it certainly wouldn't do to have a global index across all the partitions! It sounds like the key thing that needs highlighting is if the result of an expression (function call in this case) cannot guarantee the uniqueness of the value across all partitions, then that is why it's forbidden. Cheers Jim On Thu, 25 Aug 2022 at 16:32, David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 26 Aug 2022 at 03:08, James Vanns <jvanns@ilm.com> wrote: > > Also, is there a chance that this > > limitation will be relaxed in the future? > > (forgot to answer this part) > > Certainly not in the near future, I'm afraid. It would require > allowing a single index to exist over multiple tables. There has been > discussions about this in the past and the general thoughts are that > if you have a single index over all partitions, then it massively > detracts from the advantages of partitioning. With partitioning, you > can DETACH or DROP a partition and get rid of all the data quickly in > a single metadata operation. If you have an index over all partitions > then that operation is no longer a metadata-only operation. It > suddenly needs to go and remove or invalidate all records pointing to > the partition you want to detach/drop. > > David -- Jim Vanns Principal Production Engineer Industrial Light & Magic, London
It certainly does make sense to have global indices on partitioned tables. Rdd/VMS had them 20+ years ago, and they are (I still have two production systems using Rdb on OpenVMS) darned useful.
Did it require dropping the index before dropping a partition? Absolutely!! But of course every RDBMS has limitations. You accept and work around them, or migrate to a different RDBMS.
Did it require dropping the index before dropping a partition? Absolutely!! But of course every RDBMS has limitations. You accept and work around them, or migrate to a different RDBMS.
On 8/26/22 03:50, James Vanns wrote:
Thanks for that, David. It makes sense and no, it certainly wouldn't do to have a global index across all the partitions! It sounds like the key thing that needs highlighting is if the result of an expression (function call in this case) cannot guarantee the uniqueness of the value across all partitions, then that is why it's forbidden. Cheers Jim On Thu, 25 Aug 2022 at 16:32, David Rowley <dgrowleyml@gmail.com> wrote:On Fri, 26 Aug 2022 at 03:08, James Vanns <jvanns@ilm.com> wrote:Also, is there a chance that this limitation will be relaxed in the future?(forgot to answer this part) Certainly not in the near future, I'm afraid. It would require allowing a single index to exist over multiple tables. There has been discussions about this in the past and the general thoughts are that if you have a single index over all partitions, then it massively detracts from the advantages of partitioning. With partitioning, you can DETACH or DROP a partition and get rid of all the data quickly in a single metadata operation. If you have an index over all partitions then that operation is no longer a metadata-only operation. It suddenly needs to go and remove or invalidate all records pointing to the partition you want to detach/drop. David
--
Angular momentum makes the world go 'round.
Angular momentum makes the world go 'round.