Home > mailing lists

concurrent re-partitioning of declarative partitioned tables - Mailing list pgsql-general

From	Nick Cleaton
Subject	concurrent re-partitioning of declarative partitioned tables
Date	November 30, 2020 18:36:23
Msg-id	CAFgz3kvPnyX8Npj65Wp1PqXNkgrX_den6z0RQc5MbSGPPFPozw@mail.gmail.com Whole thread Raw
Responses	Re: concurrent re-partitioning of declarative partitioned tables Re: concurrent re-partitioning of declarative partitioned tables
List	pgsql-general

Tree view

I want to set up a large table on postgresql 12.4, using declarative partitioning to partition by record creation date. I'd like to have recent records in small partitions but old records in a few larger partitions, so I want merges. The merges should be concurrent, in the sense that they lock out readers or writers only for very short intervals if at all.

I'm looking at adding an extra boolean column and partitioning on that at the top level, with two parallel date-based partition trees underneath, so that I can effectively have overlapping date partitions:

create table mytable (

record_date timestamp with time zone not null,

_partition_channel boolean,

...

)

partition by list (_partition_channel);

create table mytable_chan_null

partition of mytable for values in (null)

partition by range (record_date);

create table mytable_chan_true

partition of mytable for values in (true)

partition by range (record_date);

create table mytable_day_20200101

partition of mytable_chan_null

for values from ('2020-01-01') to ('2020-01-02');

...

create table mytable_day_20200107

partition of mytable_chan_null

for values from ('2020-01-07') to ('2020-01-08');

Then to merge several day-partitions into a week-partition:

create table mytable_week_20200101

partition of mytable_chan_true

for values from ('2020-01-01') to ('2020-01-08');

... and migrate rows in batches by updating _partition_channel to true, then finally drop the empty day partitions.

Since record_date is an insertion timestamp, I don't mind that after this merge updating the record_date of a merged row could fail due to a missing partition. Likewise there's no need for new rows to be inserted with record_date values in previously merged ranges.

Questions:

Are there any hidden pitfalls with this approach ?

Have I missed a simpler way ?

Is there a project out there that will manage this for me ?

pgsql-general by date:

From: Thibaut Madelaine
Date: 30 November 2020, 16:29:58
Subject: Re: Migration from SQL Server to PostgeSQL

From: Laura Smith
Date: 30 November 2020, 20:43:22
Subject: Re: Storage and querying of filesystem paths

concurrent re-partitioning of declarative partitioned tables - Mailing list pgsql-general

Previous

Next