Thread: Most proper partitioning form on an integer column
Hello,
I need to partition a table on an integer column, which represents the month of an event, so 12 distinct values.
I am just wondering if any of you has experience about which is the best way to go with such a use case, in particular which method pick up between range, list and hash.
Thank you very much in avance for your help
Mimo
On Sun, Apr 18, 2021 at 08:07:35PM +0200, Il Mimo di Creta wrote: > I need to partition a table on an integer column, which represents the > month of an event, so 12 distinct values. > I am just wondering if any of you has experience about which is the best > way to go with such a use case, in particular which method pick up between > range, list and hash. The partition key you should choose is the one which optimizes your queries - (loading and/or reporting). How many months of data (tables) will you have ? What does a typical insert/load query look like ? What does a typical report query look like ? What does a typical query look like to "prune" old data ? I think having a separate column for "month" may be a bad idea. Consider a timestamptz column instead, and use EXTRACT('month') (or a view or a GENERATED column). See here for a query that worked poorly for exactly that reason: https://www.postgresql.org/message-id/20180128175110.GA18115@telsasoft.com Then, I think you'd use RANGE partitioning on the timestamp column by month: FOR VALUES FROM ('2021-04-18 04:00:00-08') TO ('2021-04-18 05:00:00-08') Otherwise, you might still want to also include the year in the partition key. Either with multiple columns in the key (PARTITION BY RANGE (year, month)), or sub-partitioning. Otherwise, you have no good way to prune old data - avoiding DELETE is a major benefit to partitioning. -- Justin