Partition column should be part of PK - Mailing list pgsql-performance

From Nagaraj Raj
Subject Partition column should be part of PK
Date
Msg-id 836265572.4593017.1624590607279@mail.yahoo.com
Whole thread Raw
Responses Re: Partition column should be part of PK
Re: Partition column should be part of PK
Performance benchmark of PG
List pgsql-performance

we have some partitioned tables with inherence and planning to migrate them to the declaration.

Table DDL:

CREATE TABLE c_account_p

(

    billing_account_guid character varying(40)  NOT NULL,

    ingestion_process_id bigint NOT NULL DEFAULT '-1'::integer,

    load_dttm timestamp(6) without time zone NOT NULL,

    ban integer NOT NULL,

    CONSTRAINT billing_account_pkey PRIMARY KEY (billing_account_guid, ban)

) PARTITION by RANGE(load_dttm);

When I try the create table, it's throwing below error:

ERROR:  insufficient columns in the PRIMARY KEY constraint definition

DETAIL:  PRIMARY KEY constraint on table "l_billing_account_p" lacks column "load_dttm" which is part of the partition key.

SQL state: 0A000

Is it mandatory/necessary that the partition column should be a primary key? cause if I include load_dttm as PK then it's working fine.db<>fiddle


If the partition column should be supposed to be a PK, it's challenging to create a partition by range with the date column, cause the load_dttm column chances to have duplicate if data loaded COPY or INSERT.
INSERT INTO c_account_p SELECT * from c_account_p_bkp ON CONFLICT (billing_account_guid,ban,load_dttm) DO UPDATE SET 'some stuff..'

If I receive billing_account_guid, ban combination with different load_dttm then it will end up with duplicate keys.


Could some please help me to understand this scenario?

Thanks.

 



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Planning performance problem (67626.278ms)
Next
From: Justin Pryzby
Date:
Subject: Re: Partition column should be part of PK