Re: Question related to partitioning with pg_partman - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Question related to partitioning with pg_partman
Date
Msg-id 14f28b7a-9ea4-49b2-9000-b88dbe0332ec@aklaver.com
Whole thread Raw
In response to Re: Question related to partitioning with pg_partman  (sud <suds1434@gmail.com>)
List pgsql-general
On 3/10/24 11:34, sud wrote:
> 
> On Sun, Mar 10, 2024 at 11:31 PM Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     1) The partition will be across one day(24 hours) it is just the times
>     may confuse people. Per you example 2024-03-07 00:00:00+00  is the same
>     time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and
>     -05 maybe ignored. Also it depends on the clients being consistent in
>     using timestamptz.
> 
>     2) You still have not answered what the datetime range(not date range)
>     is that will be queried. If you have the partitions Midnight to
>     Midnight
>     UTC and the clients are querying Midnight to Midnight local time the
>     query will not match the partitions.
> 
> 
>   My apology if not able to clearly put the details. Actually, the query 
> will always happen on a day basis i.e they can query from one day to 15 
> days transactions. But as you rightly pointed , the partitions can only 
> span from midnight to midnight in one timezone, and thus users who 
> queries the data from another time zone will mostly scan two partitions 
> (even if they just queries one days transaction data in their own 
> timezone). And I don't see an easy solution for this , which will help 
> all users across all time zones to scan only a single partition in the 
> database, when they queries data for a single transaction date.
> 
> And thus my question was, is it necessary to have the creation of 
> partitions to happen on UTC time zone only? and then whatever 
> transaction data inserted by the users from respective time zones will 
> be stored in the database as is and will be queried based on the user 
> timezone (it may span across multiple partitions though for a single 
> user transaction date).


This is going to depend on many things.

1) Partitions are not free they have overhead, which is fine if the 
cost(overhead) is less then the benefits. For details on that see:

https://www.postgresql.org/docs/current/ddl-partitioning.html

and partition parts of

https://www.postgresql.org/docs/current/sql-createtable.html

As part of this there is the consideration of whether daily partitions 
are really what you want?

2) What you hope to get out of the partitioning?

a) If it is confining queries to the partition boundaries then you have 
already stated that is not going to happen.

b) If it is for data pruning purposes, then you have something to 
consider on both ends. Creating/dropping partitions with Midnight to 
Midnight UTC means you will need to consider whether they cover the 
range of datetimes that your users are interested in. In other words 
creating a partition ahead that covers local times that resolve to a UTC 
time in the 'future'. On the back end not dropping a partition until it 
has gone out of scope for everybody.

To answer 1 & 2 you are probably going to need to create a test setup 
and verify how the expected queries are actually going to work with your 
partition scheme.



-- 
Adrian Klaver
adrian.klaver@aklaver.com




pgsql-general by date:

Previous
From: sud
Date:
Subject: Re: Question related to partitioning with pg_partman
Next
From: Ilya Basin
Date:
Subject: Re: Query plans not identical for `id in(1,2,3)` and `(id=1 or id=2 or id=3)`