Thread: Most proper partitioning form on an integer column

Most proper partitioning form on an integer column

From

Il Mimo di Creta

Date:

18 April 2021, 18:07:35

Hello,

I need to partition a table on an integer column, which represents the month of an event, so 12 distinct values.

I am just wondering if any of you has experience about which is the best way to go with such a use case, in particular which method pick up between range, list and hash.

Thank you very much in avance for your help

Mimo

Re: Most proper partitioning form on an integer column

From

Justin Pryzby

Date:

18 April 2021, 19:28:12

On Sun, Apr 18, 2021 at 08:07:35PM +0200, Il Mimo di Creta wrote:
> I need to partition a table on an integer column, which represents the
> month of an event, so 12 distinct values.
> I am just wondering if any of you has experience about which is the best
> way to go with such a use case, in particular which method pick up between
> range, list and hash.

The partition key you should choose is the one which optimizes your queries -
(loading and/or reporting).

How many months of data (tables) will you have ?
What does a typical insert/load query look like ?
What does a typical report query look like ?
What does a typical query look like to "prune" old data ?

I think having a separate column for "month" may be a bad idea.  Consider a
timestamptz column instead, and use EXTRACT('month') (or a view or a GENERATED
column).  See here for a query that worked poorly for exactly that reason:
https://www.postgresql.org/message-id/20180128175110.GA18115@telsasoft.com

Then, I think you'd use RANGE partitioning on the timestamp column by month:
FOR VALUES FROM ('2021-04-18 04:00:00-08') TO ('2021-04-18 05:00:00-08')

Otherwise, you might still want to also include the year in the partition key.
Either with multiple columns in the key (PARTITION BY RANGE (year, month)), or
sub-partitioning.  Otherwise, you have no good way to prune old data - avoiding
DELETE is a major benefit to partitioning.

-- 
Justin