Re: Querying a time range across multiple partitions - Mailing list pgsql-general

From John R Pierce
Subject Re: Querying a time range across multiple partitions
Date
Msg-id 540CBE14.2000305@hogranch.com
Whole thread Raw
In response to Re: Querying a time range across multiple partitions  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Querying a time range across multiple partitions
List pgsql-general
On 9/7/2014 12:55 PM, Jeff Janes wrote:
On Fri, Sep 5, 2014 at 2:17 PM, John R Pierce <pierce@hogranch.com> wrote:
On 9/5/2014 10:31 AM, Cal Heldenbrand wrote:
Number of child tables: 1581

that's an insane number of children.    We try and limit it to 50 or so child tables, for instance, 6 months retention by week, of data will millions of rows/day.

I've used more than that many for testing purposes, and there was little problem.  The main thing is that your insert trigger (if you have one on the master table) needs to be structured as a binary search-like nesting of if..elsif, not a linear-searching like structure.  Unless of course almost all inserts go into the newest partition, then it might make more sense to do the linear search with that being the first test.  But for performance, better to just insert directly into the correct child table.

any select that can't be preplanned to a specific child will need to check all 1500 children.     this is far less efficient than checking, say, 50 and letting the b-tree index of each child reject or narrow down to the specific row(s).  The one is roughly 1500*log(N/1500) while the other is 50*log(N/50) at least to a first order approximation.





-- 
john r pierce                                      37N 122W
somewhere on the middle of the left coast

pgsql-general by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Querying a time range across multiple partitions
Next
From: Sergey Burladyan
Date:
Subject: xlog min recovery request is past current point -- is it real problem?