Thread: Partitioning and unique key

Partitioning and unique key

From
veem v
Date:

Hello, 

We have our transaction tables daily range partitioned based on transaction_timestamp column which is timestamptz data type and these are having composite primary key on (transaction_id, transaction_timestamp). And we were using an "insert on conflict" for loading data to our system , which means if another record comes to the system with the same transaction_id and transaction_timestamp, it will get updated. This way we already have 60 days worth of data stored in our system with approx. 70 million transactions per day.

But we just got to know from business that the data should be unique by only transaction_id but not transaction_timestamp. Any incoming data with the same transaction_id(even different transaction_timestamp) should get updated but not inserted.

Also these daily partitions are going to hold 400million rows in future and will be queried on the transaction_timestamp filter so we can't really avoid the partitioning option here considering future growth.

But due to postgres limitations we are unable to have this unique constraint or primary key only on the transaction_id column, we have to include transaction_timestamp with it as a composite key. So I want to understand from experts if there is any possible way to satisfy both partitioning on transaction_timestamp column and unique key or pk just on transaction_id only? 

Note-its 15.4 postgres database. 


Regards

Veem

Re: Partitioning and unique key

From
Adrian Klaver
Date:
On 8/31/24 13:02, veem v wrote:
> Hello,
> 
> We have our transaction tables daily range partitioned based on 
> transaction_timestamp column which is timestamptz data type and these 
> are having composite primary key on (transaction_id, 
> transaction_timestamp). And we were using an "insert on conflict" for 
> loading data to our system , which means if another record comes to the 
> system with the same transaction_id and transaction_timestamp, it will 
> get updated. This way we already have 60 days worth of data stored in 
> our system with approx. 70 million transactions per day.
> 
> But we just got to know from business thatthe data should be unique by 
> only transaction_id but not transaction_timestamp. Any incoming data 
> with the same transaction_id(even different transaction_timestamp) 
> should get updated but not inserted.
> 
> Also these daily partitions are going to hold 400million rows in future 
> and will be queried on the transaction_timestamp filter so we can't 
> really avoid the partitioning option here considering future growth.
> 
> But due to postgres limitations we are unable to have this unique 
> constraint or primary key only on the transaction_id column, we have to 
> include transaction_timestamp with it as a composite key. So I want to 
> understand from experts if there is any possible way to satisfy both 
> partitioning on transaction_timestamp column and unique key or pk just 
> on _trans_action_id only?

The model is at odds with itself and untenable. If the tables hold 
multiple rows for a given transaction_id  then you cannot have a 
PK/Unique constraint on that column. Seems there is a decided lack of 
any planning. The only way I can see this happening is consolidating all 
the duplicate transaction_id rows into a single row for each 
transaction_id. That then leads to the question of how to do that and 
retain the 'correct' information from the selection of rows for each 
transaction_id.

> 
> Note-its 15.4 postgres database.
> 
> 
> Regards
> 
> Veem
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Partitioning and unique key

From
veem v
Date:

On Sun, 1 Sept 2024 at 03:58, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

The model is at odds with itself and untenable. If the tables hold
multiple rows for a given transaction_id  then you cannot have a
PK/Unique constraint on that column. Seems there is a decided lack of
any planning. The only way I can see this happening is consolidating all
the duplicate transaction_id rows into a single row for each
transaction_id. That then leads to the question of how to do that and
retain the 'correct' information from the selection of rows for each
transaction_id.


Yes we had messed up the data for now and have multiple records for each transaction_id persisted and thus we need to fix the data. But more than that , as I stated , I wanted to understand first 
1) if it's technically possible to have a unique key on only the transaction_id column having the partition key on the transaction_timestamp, because the table is going to be queried/purged based on the transaction_timestamp?

2) Additionally we were thinking if above is technically not possible, then the maximum granularity which we can have for each transaction_id will be a day, so the partition key transaction_timestmp can be truncated to have only date component but no time component. So the primary key will be (transaction_id, transaction_date). But we also don't want to lose the time component and persist the existing data of transaction_timestmp (which will have a time component in it, in a separate column).

And in above case , for fixing the existing data in least disruptive way, as we have currently duplicate transaction_id inserted into the table already because of the composite primary key(transaction_id, transaction_timestmp).Can we simply
 i)rename the existing column transaction_timestmp to transaction_date and then add new column transaction_timestmp using the values of existing column partition by partition. 
ii)And then delete the duplicate data using query something as below , each partition by partition. 
iii)And then alter the datatype of the partition key transaction_date to DATE in one shot at the table level(which should be fast as its having more granularity as compare to existing timestamptype, so should be catalog or dictionary change only), and that will remain the part of composite PK (transaction_id,transaction_date).
iv) Repeat this step for all child partition tables and then for the parent partition tables. 

Will this technique be the most efficient way of fixing this mess? 

WITH ranked_records AS (
            SELECT column1_id, column2_timestamptz,
                   ROW_NUMBER() OVER (PARTITION BY column1_id, date_trunc('day', column2_timestamptz)
                                      ORDER BY column2_timestamptz DESC) AS rn
            FROM partition_name
        )
        DELETE FROM partition_name T1
        WHERE EXISTS (
            SELECT 1
            FROM ranked_records T2
            WHERE T1.column1_id = T2.column1_id
              AND T1.column2_timestamptz = T2.column2_timestamptz
              AND T2.rn > 1
        )

Re: Partitioning and unique key

From
"David G. Johnston"
Date:
On Saturday, August 31, 2024, veem v <veema0000@gmail.com> wrote:

1) if it's technically possible to have a unique key on only the transaction_id column having the partition key on the transaction_timestamp, because the table is going to be queried/purged based on the transaction_timestamp?

There is presently no such thing as a cross-partition unique constraint.  If you define the constraint on the [partitioned] table the documentation is perfectly clear, as are I believe the error messages, that it will require all partitioning columns to be included - since that is what happens in reality.  If you target the partitions directly with the unique index or constraint no such limitation should exist.
 

iii)And then alter the datatype of the partition key transaction_date to DATE in one shot at the table level(which should be fast as its having more granularity as compare to existing timestamptype, so should be catalog or dictionary change only), and that will remain the part of composite PK (transaction_id,transaction_date).

While this might seem logical, in reality date and timestamptz are different fixed-width data types and thus any attempt to change from one to the other will involve a table rewrite.  Best you could do is leave the timestamptz in place and just truncate to day so the time is always midnight UTC.
 
iv) Repeat this step for all child partition tables and then for the parent partition tables.

I’d suggest trying to just build a new partitioned table that is correctly defined. Then populate it.  Add a trigger to the existing one to keep the new one in sync.  Then change your application code to point to the new partitioned table.  At which point the old partitioned table can be dropped.

David J.

Re: Partitioning and unique key

From
veem v
Date:

On Sun, 1 Sept 2024 at 09:13, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Saturday, August 31, 2024, veem v <veema0000@gmail.com> wrote:


iii)And then alter the datatype of the partition key transaction_date to DATE in one shot at the table level(which should be fast as its having more granularity as compare to existing timestamptype, so should be catalog or dictionary change only), and that will remain the part of composite PK (transaction_id,transaction_date).

While this might seem logical, in reality date and timestamptz are different fixed-width data types and thus any attempt to change from one to the other will involve a table rewrite.  Best you could do is leave the timestamptz in place and just truncate to day so the time is always midnight UTC.
 

 
Here  , if we keep the PK column as is i.e. the transaction_timestamp as timestamptz but truncate the time component , in that case again in future if someone tries to insert(using insert on conflict) data into the table with time component , it will get consumed and will not be restricted by the PK constraint. So I was trying to make the data type also as DATE for the transaction_timestap column.

As in this case anyway we have to create another column to populate the date+timestamp values as we cant throw those values away per business need, so we will be kind of rewriting the table.So is it okay if if we will

1) Detach all the partitions.
2)Do the alter using "only" key word in table level. (For adding new column transaction_timestamp_new  to hold date+timestamp value and also altering the existing transaction_timestamp column to DATE from type timestamptz).
3)Then do the data fix(delete the duplicates) and alter the column, one partition at a time for all of the partitions and once done , attach  those partitions one by one.
5)Rename the columns at table level.Hope this won't need any table rewrite.

Is there any downside if we go by the above approach?

Re: Partitioning and unique key

From
veem v
Date:

On Sun, 1 Sept 2024 at 10:03, veem v <veema0000@gmail.com> wrote:

On Sun, 1 Sept 2024 at 09:13, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Saturday, August 31, 2024, veem v <veema0000@gmail.com> wrote:


iii)And then alter the datatype of the partition key transaction_date to DATE in one shot at the table level(which should be fast as its having more granularity as compare to existing timestamptype, so should be catalog or dictionary change only), and that will remain the part of composite PK (transaction_id,transaction_date).

While this might seem logical, in reality date and timestamptz are different fixed-width data types and thus any attempt to change from one to the other will involve a table rewrite.  Best you could do is leave the timestamptz in place and just truncate to day so the time is always midnight UTC.
 

 
Here  , if we keep the PK column as is i.e. the transaction_timestamp as timestamptz but truncate the time component , in that case again in future if someone tries to insert(using insert on conflict) data into the table with time component , it will get consumed and will not be restricted by the PK constraint. So I was trying to make the data type also as DATE for the transaction_timestap column.

As in this case anyway we have to create another column to populate the date+timestamp values as we cant throw those values away per business need, so we will be kind of rewriting the table.So is it okay if if we will

1) Detach all the partitions.
2)Do the alter using "only" key word in table level. (For adding new column transaction_timestamp_new  to hold date+timestamp value and also altering the existing transaction_timestamp column to DATE from type timestamptz).
3)Then do the data fix(delete the duplicates) and alter the column, one partition at a time for all of the partitions and once done , attach  those partitions one by one.
5)Rename the columns at table level.Hope this won't need any table rewrite.

Is there any downside if we go by the above approach?

Or do you mean to say there is no way we can modify the data type of a partition key even by detaching the partitions one by one? And thus we may have only way left is to create the table from scratch with partitions and populate the data to it? I was avoiding this because we have many indexes also in it , so creating from scratch means creating those indexes again. So I wanted to achieve it by detaching partitions, doing the required change and attaching it again.

Re: Partitioning and unique key

From
veem v
Date:


On Sun, 1 Sept 2024 at 11:38, veem v <veema0000@gmail.com> wrote:

On Sun, 1 Sept 2024 at 10:03, veem v <veema0000@gmail.com> wrote:

On Sun, 1 Sept 2024 at 09:13, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Saturday, August 31, 2024, veem v <veema0000@gmail.com> wrote:


iii)And then alter the datatype of the partition key transaction_date to DATE in one shot at the table level(which should be fast as its having more granularity as compare to existing timestamptype, so should be catalog or dictionary change only), and that will remain the part of composite PK (transaction_id,transaction_date).

While this might seem logical, in reality date and timestamptz are different fixed-width data types and thus any attempt to change from one to the other will involve a table rewrite.  Best you could do is leave the timestamptz in place and just truncate to day so the time is always midnight UTC.
 

 
Here  , if we keep the PK column as is i.e. the transaction_timestamp as timestamptz but truncate the time component , in that case again in future if someone tries to insert(using insert on conflict) data into the table with time component , it will get consumed and will not be restricted by the PK constraint. So I was trying to make the data type also as DATE for the transaction_timestap column.

As in this case anyway we have to create another column to populate the date+timestamp values as we cant throw those values away per business need, so we will be kind of rewriting the table.So is it okay if if we will

1) Detach all the partitions.
2)Do the alter using "only" key word in table level. (For adding new column transaction_timestamp_new  to hold date+timestamp value and also altering the existing transaction_timestamp column to DATE from type timestamptz).
3)Then do the data fix(delete the duplicates) and alter the column, one partition at a time for all of the partitions and once done , attach  those partitions one by one.
5)Rename the columns at table level.Hope this won't need any table rewrite.

Is there any downside if we go by the above approach?

Or do you mean to say there is no way we can modify the data type of a partition key even by detaching the partitions one by one? And thus we may have only way left is to create the table from scratch with partitions and populate the data to it? I was avoiding this because we have many indexes also in it , so creating from scratch means creating those indexes again. So I wanted to achieve it by detaching partitions, doing the required change and attaching it again.


I tried  by detaching the partitions, but still then it's not allowing me to alter the DAT TYPE of the partition key and throwing error as below. 

ERROR:  cannot alter column "<patition_key_column>" because it is part of the partition key of relation "<table_name>"

Now I am thinking if it's really going to get too complex if we try to stick with the partition detach and attach strategy. As a few teammates say , having a new column added with just a date type and then drop the existing FK and PK first and then detach all the partitions, and attach the partitions back using the new DATE column. and then recreate the PK again. Btw we have ~5 partition tables with parent child relationship on which this fix has to be applied. So I'm still wondering the best way possible for fixing this issue.

Re: Partitioning and unique key

From
Laurenz Albe
Date:
On Sun, 2024-09-01 at 01:32 +0530, veem v wrote:
> due to postgres limitations we are unable to have this unique constraint or primary key
> only on the transaction_id column, we have to include transaction_timestamp with it as
> a composite key. So I want to understand from experts if there is any possible way to
> satisfy both partitioning on transaction_timestamp column and unique key or pk just on
> transaction_id only? 

No, you cannot have both.

Usually the solution is to *not* create a primary key on the partitioned table
and instead create a primary key on each partition.

That won't guarantee global uniqueness (and there is no way to do that), but it
goes a long way by ensuring that the column is unique within each partition.

Yours,
Laurenz Albe



Re: Partitioning and unique key

From
veem v
Date:
On Mon, 2 Sept 2024 at 19:13, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sun, 2024-09-01 at 01:32 +0530, veem v wrote:
> due to postgres limitations we are unable to have this unique constraint or primary key
> only on the transaction_id column, we have to include transaction_timestamp with it as
> a composite key. So I want to understand from experts if there is any possible way to
> satisfy both partitioning on transaction_timestamp column and unique key or pk just on
> transaction_id only? 

No, you cannot have both.

Usually the solution is to *not* create a primary key on the partitioned table
and instead create a primary key on each partition.

That won't guarantee global uniqueness (and there is no way to do that), but it
goes a long way by ensuring that the column is unique within each partition.

Yours,
Laurenz Albe

Thank you so much.

So it means in our case the existing PK on table level on column (txn_id and txn_timestamp), we should drop that and create a unique index on each partition level and also the same way the foreign key also maps to the parent table partitions. And in that case , can we follow this as best practices to not have the primary keys defined at the tabe level at all, but on the partition level only, or there exist any down side to it too? Please suggest.

Also then what I see is, it will make the data load query fail which uses "insert on conflict" to insert data into the table and that requires the primary key on both the columns to have on table level. Also the partition maintenance job which uses partman extension uses the template table which in turn uses table level properties for creating new partitions and they will not have these unique indexes created for the new partitions as because the unique index property is not on the table level but partition level. Can you share your thoughts on these?  

Re: Partitioning and unique key

From
Laurenz Albe
Date:
On Mon, 2024-09-02 at 21:39 +0530, veem v wrote:
> On Mon, 2 Sept 2024 at 19:13, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Sun, 2024-09-01 at 01:32 +0530, veem v wrote:
> > > due to postgres limitations we are unable to have this unique constraint or primary key
> > > only on the transaction_id column, we have to include transaction_timestamp with it as
> > > a composite key. So I want to understand from experts if there is any possible way to
> > > satisfy both partitioning on transaction_timestamp column and unique key or pk just on
> > > transaction_id only? 
> >
> > No, you cannot have both.
> >
> > Usually the solution is to *not* create a primary key on the partitioned table
> > and instead create a primary key on each partition.
> >
> > That won't guarantee global uniqueness (and there is no way to do that), but it
> > goes a long way by ensuring that the column is unique within each partition.
>
> So it means in our case the existing PK on table level on column (txn_id and txn_timestamp),
> we should drop that and create a unique index on each partition level and also the same way
> the foreign key also maps to the parent table partitions. And in that case , can we follow
> this as best practices to not have the primary keys defined at the tabe level at all, but
> on the partition level only, or there exist any down side to it too? Please suggest.

You can keep the primary key defined on both columns if it is good enough for you.
But it will give you lower guarantees of uniqueness: with that primary key, there could
be two rows with a different timestamp, but the same "txn_id", and these two rows could
be in the same partition...

Also, if you need a foreign key pointing *to* the partitioned table, you cannot do without
a primary key.  But I recommend that you do *not* define such foreign keys: they will make
it more difficult to detach a partition.

If you partition two tables in the same way, you can use foreign keys between the partitions
instead of foreign keys between the partitioned tables.  Such foreign keys won't be a problem.

> Also then what I see is, it will make the data load query fail which uses "insert on conflict"
> to insert data into the table and that requires the primary key on both the columns to have
> on table level.

Yes, that is true.  A disadvantage of not having a unique constraint on the partitioned table.

> Also the partition maintenance job which uses partman extension uses the template table which
> in turn uses table level properties for creating new partitions and they will not have these
> unique indexes created for the new partitions as because the unique index property is not on
> the table level but partition level. Can you share your thoughts on these?

Don't use partman.  Or if you do, create the primary key yourself, after partman has created
the partition.
I wouldn't let the limitations of a tool govern my design choices.

Yours,
Laurenz Albe



Re: Partitioning and unique key

From
veem v
Date:

On Tue, 3 Sept 2024 at 01:14, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

You can keep the primary key defined on both columns if it is good enough for you.
But it will give you lower guarantees of uniqueness: with that primary key, there could
be two rows with a different timestamp, but the same "txn_id", and these two rows could
be in the same partition...

Also, if you need a foreign key pointing *to* the partitioned table, you cannot do without
a primary key.  But I recommend that you do *not* define such foreign keys: they will make
it more difficult to detach a partition.

If you partition two tables in the same way, you can use foreign keys between the partitions
instead of foreign keys between the partitioned tables.  Such foreign keys won't be a problem.


Thank You so much.
As you rightly said "they will make it more difficult to detach a partition." , we are really seeing a longer time when detaching parent table partitions.It runs forever sometimes. So do you mean it's because we have primary key defined table level or it's because we have FK defined in table level (for multiple child tables which are also partitioned)?

 We were thinking it's because we have FK defined on tablelevel , so we were planning to make the FK on partition level. But as you just pointed now , even keeping the PK on table level will also make the detach partition slow? I understand, for detaching partitions , it may be scanning while child because of the FK defined on the table level. but i am unable to understand how the table level PK impacts the detach partition from parent here.

My understanding is PK can only be created on table level but not on the partition level. On the partition level we only can have a "unique index" defined. Correct me if my understanding is wrong.

Re: Partitioning and unique key

From
Laurenz Albe
Date:
On Tue, 2024-09-03 at 10:39 +0530, veem v wrote:
> As you rightly said "they will make it more difficult to detach a partition." ,
> we are really seeing a longer time when detaching parent table partitions.
> It runs forever sometimes. So do you mean it's because we have primary key
> defined table level or it's because we have FK defined in table level
> (for multiple child tables which are also partitioned)?

I'd say it is because of the foreign key.

If you have a foreign key that points to a partitioned table, and you detach
a partition, PostgreSQL has to verify that that won't violate the constraint,
so it has to scan the tables, which will take time if the partitions are large.

> We were thinking it's because we have FK defined on tablelevel , so we were
> planning to make the FK on partition level.

Good move.

> But as you just pointed now , even keeping the PK on table level will also
> make the detach partition slow? I understand, for detaching partitions ,
> it may be scanning while child because of the FK defined on the table level.
> but i am unable to understand how the table level PK impacts the detach
> partition from parent here.

No, a primary key on the partitioned table won't be a problem for performance.

My concern was that if what you really would like is "id" to be unique, how does
a primary key on (id, some_timestamp) benefit you?

> My understanding is PK can only be created on table level but not on the
> partition level. On the partition level we only can have a "unique index"
> defined. Correct me if my understanding is wrong.

No, you can define a primary key on the partition.  That is, if you have no
primary key on the partitioned table.  A primary key on the partitioned table
is a primary key on each partition, and a table can only have a single primary
key, so adding another primary key on the partition would cause an error.

Yours,
Laurenz Albe