Re: On partitioning - Mailing list pgsql-hackers

From Robert Haas
Subject Re: On partitioning
Date
Msg-id CA+TgmoYuRLE-H1c2sXh8KNnHca71FWg6w4N0Wg8Zvvp0m6_NHw@mail.gmail.com
Whole thread Raw
In response to Re: On partitioning  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Tue, Dec 16, 2014 at 9:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 12/16/2014 05:52 PM, Robert Haas wrote:
>> But in a more complicated case where the value there isn't known until
>> runtime, yeah, it scans everything.  I'm not sure what the best way to
>> fix that is.  If the partition bounds were stored in a structured way,
>> as we've been discussing, then the Append or Merge Append node could,
>> when initialized, check which partition the id = X qual routes to and
>> ignore the rest.  But that's more iffy with the current
>> representation, I think.
>
> Huh.  I was just testing:
>
> WHERE event_time BETWEEN timestamptz '2014-12-01' and ( timestamptz
> '2014-12-01' + interval '1 month')
>
> In that case, the expression above got folded to constants by the time
> Postgres did the index scans, but it scanned all partitions.  So somehow
> (timestamptz + interval) doesn't get constant-folded until after
> planning, at least not on 9.3.
>
> And of course this leaves out common patterns like "now() - interval '30
> days'" or "to_timestamp('20141201','YYYYMMDD')"
>
> Anyway, what I'm saying is that I personally regard the inability to
> handle even moderately complex expressions a major failing of our
> existing partitioning scheme (possibly its worst single failing), and I
> would regard any new partitioning feature which didn't address that
> issue as suspect.

I understand, but I think you need to be careful not to stonewall all
progress in the name of getting what you want.  Getting the
partitioning metadata into the system catalogs in a suitable format
will be a huge step forward regardless of whether it solves this
particular problem right away or not, because it will make it possible
to solve this problem in a highly-efficient way, which is quite hard
to do right now.

For example, we could (right now) write code that would do run-time
partition pruning by taking the final filter clause, with all values
substituted in, and re-checking for partitions that can be pruned via
constraint exclusion.  But that would be expensive and would often
fail to find anything useful.  Even in the best case where it works
out it's O(n) in the number of partitions, and will therefore perform
badly for large numbers of partitions (even, say, 1000).  But once the
partitioning metadata is stored in the catalog, we can implement this
as a binary search -- O(lg n) time -- and the constant factor should
be lower -- and it will be pretty easy to skip it in cases where it's
useless so that we don't waste cycles spinning our wheels.  Whether
the initial patch covers all the cases you care about or not, and it
probably won't, it will be a really big step towards making it
POSSIBLE to handle those cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [Bug] Inconsistent result for inheritance and FOR UPDATE.
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Escaping from blocked send() reprised.