Thread: On Scalability

On Scalability

From
Vincenzo Romano
Date:
Hi all.
I'm wondering about PGSQL scalability.
In particular I have two main topics in my mind:

1. What'd be the behavior of the query planner in the case I have
a single huge table with hundreds or thousands of partial indexes
(just differing by the WHERE clause).
This is an idea of mine to make index-partitioning instead of
table-partitioning.

2. What'd be the behavior of the query planner in the case I have
hundreds or thousands of child tables, possibly in a multilevel hierarchy
(let's say, partitioning by year, month and company).
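
To make the two ideas concrete, here is a minimal sketch (the table and
all names are invented just for illustration):

-- Idea 1: one huge table, many partial indexes ("index partitioning")
CREATE TABLE sales (
    company  integer NOT NULL,
    sold_at  date    NOT NULL,
    amount   numeric NOT NULL
);
CREATE INDEX sales_2010_01_idx ON sales (sold_at)
    WHERE sold_at >= '2010-01-01' AND sold_at < '2010-02-01';
CREATE INDEX sales_2010_02_idx ON sales (sold_at)
    WHERE sold_at >= '2010-02-01' AND sold_at < '2010-03-01';
-- ...hundreds or thousands more, differing only in the WHERE clause

-- Idea 2: hundreds or thousands of child tables (inheritance)
CREATE TABLE sales_2010_01_c42 (
    CHECK (company = 42
           AND sold_at >= '2010-01-01' AND sold_at < '2010-02-01')
) INHERITS (sales);
-- ...one child per (year, month, company) combination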

I fear the presence of linear selection algorithms in these two cases that
would kill my design.

Is there any insight about these two points?

-- 
NotOrAnd Information Technologies
Vincenzo Romano
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
"Joshua D. Drake"
Date:
On Thu, 2010-07-29 at 19:08 +0200, Vincenzo Romano wrote:
> Hi all.
> I'm wondering about PGSQL scalability.
> In particular I have two main topics in my mind:
>
> 1. What'd be the behavior of the query planner in the case I have
> a single huge table with hundreds or thousands of partial indexes
> (just differing by the WHERE clause).
> This is an idea of mine to make index-partitioning instead of
> table-partitioning.

Well the planner is not going to care about the partial indexes that
don't match the where clause but what you are suggesting is going to
make writes and maintenance extremely expensive. It will also increase
planning time as the optimizer at a minimum has to discard the use of
those indexes.

>
> 2. What'd be the behavior of the query planner in the case I have
> hundreds or thousands of child tables, possibly in a multilevel hierarchy
> (let's say, partitioning by year, month and company).

Again, test it. Generally speaking the number of child tables directly
correlates with planning time. Most find that 60-100 child tables is
really the highest you can go.

It all depends on actual implementation and business requirements
however.

Sincerely,

Joshua D. Drake


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/29 Joshua D. Drake <jd@commandprompt.com>:
> On Thu, 2010-07-29 at 19:08 +0200, Vincenzo Romano wrote:
>> Hi all.
>> I'm wondering about PGSQL scalability.
>> In particular I have two main topics in my mind:
>>
>> 1. What'd be the behavior of the query planner in the case I have
>> a single huge table with hundreds or thousands of partial indexes
>> (just differing by the WHERE clause).
>> This is an idea of mine to make index-partitioning instead of
>> table-partitioning.
>
> Well the planner is not going to care about the partial indexes that
> don't match the where clause but what you are suggesting is going to
> make writes and maintenance extremely expensive. It will also increase
> planning time as the optimizer at a minimum has to discard the use of
> those indexes.
>
>>
>> 2. What'd be the behavior of the query planner in the case I have
>> hundreds or thousands of child tables, possibly in a multilevel hierarchy
>> (let's say, partitioning by year, month and company).
>
> Again, test it. Generally speaking the number of child tables directly
> correlates with planning time. Most find that 60-100 child tables is
> really the highest you can go.
>
> It all depends on actual implementation and business requirements
> however.
>
> Sincerely,
>
> Joshua D. Drake

I expect that a more complex schema will imply higher workloads
on the query planner. What I don't know is how the increase in the
workload will happen: linearly, sublinearly, polynomially or what?

Significant testing would require a prototype implementation with
an almost complete feed of data from the current solution.
But I'm at the feasibility study stage and don't have enough resources
for that.

Thanks anyway for the insights, Joshua.
Does the 60-100 table limit apply to a single level
of inheritance, or is it more general?

-- 
NotOrAnd Information Technologies
Vincenzo Romano
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
"Joshua D. Drake"
Date:
On Thu, 2010-07-29 at 19:34 +0200, Vincenzo Romano wrote:

> I expect that a more complex schema will imply higher workloads
> on the query planner. What I don't know is how the increase in the
> workload will happen: linearly, sublinearly, polynomially or what?
>
> Significant testing would require a prototype implementation with
> an almost complete feed of data from the current solution.
> But I'm at the feasibility study stage and have not enough resources
> for that.
>
> Thanks anyway for the insights, Joshua.
> Does the 60-100 table limit apply to a single level
> of inheritance, or is it more general?

I do not currently have experience (except that it is possible) with
multi-level inheritance and postgresql.

>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/29 Joshua D. Drake <jd@commandprompt.com>:
> On Thu, 2010-07-29 at 19:34 +0200, Vincenzo Romano wrote:
>
>> I expect that a more complex schema will imply higher workloads
>> on the query planner. What I don't know is how the increase in the
>> workload will happen: linearly, sublinearly, polynomially or what?

Do you think I should ask somewhere else?
Any hint?

>> Thanks anyway for the insights, Joshua.
>> Does the 60-100 table limit apply to a single level
>> of inheritance, or is it more general?
>
> I do not currently have experience (except that it is possible) with
> multi-level inheritance and postgresql.

Thanks anyway.

-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
"Joshua D. Drake"
Date:
On Thu, 2010-07-29 at 19:52 +0200, Vincenzo Romano wrote:
> 2010/7/29 Joshua D. Drake <jd@commandprompt.com>:
> > On Thu, 2010-07-29 at 19:34 +0200, Vincenzo Romano wrote:
> >
> >> I expect that a more complex schema will imply higher workloads
> >> on the query planner. What I don't know is how the increase in the
> >> workload will happen: linearly, sublinearly, polynomially or what?
>
> Do you think I should ask somewhere else?
> Any hint?

The two people that would likely know the best are on vacation, TGL and
Heikki. You may have to wait a bit.

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: On Scalability

From
Josh Berkus
Date:
> Do you think I should ask somewhere else?
> Any hint?

I might suggest asking on the pgsql-performance mailing list instead.
You'll get *lots* more speculation there.  However, the only way you're
really going to know is to test.


--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/29 Josh Berkus <josh@agliodbs.com>:
>
>> Do you think I should ask somewhere else?
>> Any hint?
>
> I might suggest asking on the pgsql-performance mailing list instead.
> You'll get *lots* more speculation there.  However, the only way you're
> really going to know is to test.

Or maybe checking against the source code and its documentation, if any.

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/29 Josh Berkus <josh@agliodbs.com>:
>
>> Or maybe checking against the source code and its documentation, if any.
>
> No, not really.  What you really want to know is: what's the real
> planner overhead of having dozens/hundreds of partial indexes?  What's
> the write overhead?  There's no way you can derive that from the source
> code faster than you can test it.

Again, such a test would be rather costly for my group at this stage.

I think that knowing whether certain parts have been implemented
with linear or sub-linear (or whatever else) algorithms would
give good insights about scalability.

At first glance it seems that for inheritance some bottleneck is
hindering full exploitation of table partitioning.

Is there anyone who knows whether those algorithms are linear or not?

And of course, I agree that real tests on real data will provide the real thing.

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Greg Stark
Date:
On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:
> At first glance it seems that for inheritance some bottleneck is
> hindering full exploitation of table partitioning.

There have been lengthy discussions of how to implement partitioning
to fix these precise problems, yes.


> Is there anyone who knows whether those algorithms are linear or not?

They're linear in both cases. But they happen at plan time rather than
query execution time. So if your application prepares all its queries
and then uses them many times it would not slow down query execution
but would slow down the query planning time. In some applications this
is much better, but in others unpredictable run-times are as bad as long
run-times.

Also in the case of having many partial indexes it would slow down
inserts and updates as well, though to a lesser degree, and that would
happen at execution time.
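
A trivial illustration of that split, with a made-up table:

CREATE TABLE t (k int, v int);

-- Planning happens once, here:
PREPARE q (int) AS SELECT sum(v) FROM t WHERE k = $1;

-- Each execution then reuses the stored plan, so the (possibly
-- large) planning cost is not paid again:
EXECUTE q (1);
EXECUTE q (2);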


-- 
greg


Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/30 Greg Stark <gsstark@mit.edu>:
> On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
> <vincenzo.romano@notorand.it> wrote:
>> At first glance it seems that for inheritance some bottleneck is
>> hindering full exploitation of table partitioning.
>
> There have been lengthy discussions of how to implement partitioning
> to fix these precise problems, yes.

Any reference?

>> Is there anyone who knows whether those algorithms are linear or not?
>
> They're linear in both cases. But they happen at plan time rather than
> query execution time. So if your application prepares all its queries
> and then uses them many times it would not slow down query execution
> but would slow down the query planning time. In some applications this
> is much better but in others unpredictable run-times is as bad as long
> run-times.

Hmmm ... maybe I'm missing the inner meaning of your remarks, Greg.
By using PREPARE I run the query planner sooner and reuse the plan
at later executions.
You can bet that some of the PREPAREd query parameters will
pertain either to the child tables' CHECK constraints (for table partitioning)
or to the partial indexes' WHERE conditions (for index partitioning).

It's exactly at this point (execution time) that the "linearity" would
kill a query over a heavily partitioned table.

Is this what you meant?  :-)

> Also in the case of having many partial indexes it would slow down
> inserts and updates as well, though to a lesser degree, and that would
> happen at execution time.

This makes full sense to me.


-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Josh Berkus
Date:
> Is there anyone who knows whether those algorithms are linear or not?

Read the code?  It's really very accessible, and there's lots and lots
of comments.  While the person who wrote the code is around, isn't it
better to see the real implementation?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: On Scalability

From
Vincenzo Romano
Date:
2010/7/30 Josh Berkus <josh@agliodbs.com>:
>
>> Is there anyone who knows whether those algorithms are linear or not?
>
> Read the code?  It's really very accessible, and there's lots and lots
> of comments.  While the person who wrote the code is around, isn't it
> better to see the real implementation?

If the programmer(s) who wrote that part are around, a simple hint would suffice.
Even a hint about where to look in the code would be much appreciated: the query
planner is not as simple as the "ls" command (which is not that simple any
more, though).

It looks like I need to go the hard way ...
Starting from postgresql-8.4.4/src/backend/optimizer

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.romano@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Greg Smith
Date:
Vincenzo Romano wrote:
> By using PREPARE I run the query planner sooner and reuse the plan
> at later executions.
> You can bet that some of the PREPAREd query parameters will
> pertain either to the child tables' CHECK constraints (for table partitioning)
> or to the partial indexes' WHERE conditions (for index partitioning).
>   

Prepared statements are not necessarily a cure for long query planning 
time, because the sort of planning decisions made with partitioned child 
tables and index selection can need to know the parameter values to 
execute well; that's usually the situation rather than the exception 
with partitions.  You run the risk that the generic prepared plan will 
end up looking at all the partitions, because at plan-preparation time 
it can't figure out which ones can be excluded; for some types of queries 
it can only figure that out once the actual parameter values are in there.
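
A small sketch of that risk (made-up tables, using inheritance as in
the manual's partitioning example):

CREATE TABLE m (d date);
CREATE TABLE m1 (CHECK (d <  DATE '2010-07-01')) INHERITS (m);
CREATE TABLE m2 (CHECK (d >= DATE '2010-07-01')) INHERITS (m);
SET constraint_exclusion = on;

-- With a literal, the planner can prove m1 is excluded:
EXPLAIN SELECT * FROM m WHERE d = DATE '2010-08-01';

-- With a parameter it can't, so the generic plan visits every child:
PREPARE p (date) AS SELECT * FROM m WHERE d = $1;
EXPLAIN EXECUTE p (DATE '2010-08-01');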

I think you aren't quite lined up with the people suggesting "test it" 
in terms of what that means.  The idea is not that you should build a 
full on application test case yet, which can be very expensive.  The 
idea is that you might explore things like "when I partition this way 
increasing the partitions from 1 to n, does query time go up linearly?" 
by measuring with fake data and a machine-generated schema.  What's 
happened in some of these cases is that, despite the theory, some 
constant or external overhead ends up dominating behavior for lower 
numbers.  As an example, it was recognized that the amount of statistics 
for a table collected with default_statistics_target had a quadratic 
impact on some aspects of performance.  But it turned out that for the 
range of interesting values to most people, the measured runtime did not 
go up with the square as feared.  Only way that was sorted out was to 
build a simple simulation.

Here's a full example from that discussion that shows the sort of tests 
you probably want to try, and comments on the perils of guessing based 
on theory rather than testing:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg00601.php
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00687.php

generate_series can be very helpful here, and you can even use that to 
generate timestamps if you need them in the data set.
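
For example, something along these lines (names arbitrary):

CREATE TABLE fake (id int, grp int, ts timestamp);

-- A million rows, with generate_series also driving the timestamps:
INSERT INTO fake
SELECT n, (random() * 1000)::int,
       timestamp '2010-01-01' + n * interval '1 minute'
FROM generate_series(1, 1000000) AS g(n);
ANALYZE fake;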

That said, anecdotally everyone agrees that partitions don't scale well 
into even the very low hundreds for most people, and doing multi-level 
ones won't necessarily drop query planning time--just the cost 
of maintaining the underlying tables and indexes.  My opinion is that 
building a simple partitioned case and watching how the EXPLAIN plans 
change as you adjust things will be more instructive for you than either 
asking about it or reading the source.  Vary the parameters, watch the 
plans, measure things and graph them if you want to visualize the 
behavior better.  Same thing goes for large numbers of partial indexes, 
which have a similar query planning impact, but unlike partitions I 
haven't seen anyone analyze them via benchmarks.  I'm sure you could get 
help here (probably the performance list is a better spot though) with 
getting your test case right if you wanted to try and nail that down.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us



Re: On Scalability

From
Robert Haas
Date:
On Fri, Jul 30, 2010 at 3:50 PM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:
> 2010/7/30 Josh Berkus <josh@agliodbs.com>:
>>
>>> Is there anyone who knows whether those algorithms are linear or not?
>>
>> Read the code?  It's really very accessible, and there's lots and lots
>> of comments.  While the person who wrote the code is around, isn't it
>> better to see the real implementation?
>
> If the programmer(s) who wrote that part are around, a simple hint would suffice.
> Even a hint about where to look in the code would be much appreciated: the query
> planner is not as simple as the "ls" command (which is not that simple any
> more, though).
>
> It looks like I need to go the hard way ...
> Starting from postgresql-8.4.4/src/backend/optimizer

I think you're approaching this in the wrong way.  You've repeatedly
said you don't want to do all the work of setting up a test, but
trying to search the code for algorithms that might not be linear is
not going to be easier.  I've been reading this thread and I'm fairly
familiar with this code, and I even understand the algorithms pretty
well, and I don't know whether they're going to be linear for what you
want to do or not.  Certainly, the overall task of join planning is
exponential-time in the number of *tables*, but if you're just doing
SELECT statements on a single table, will that be linear?  Tough to
say.  Certainly, there are a lot of linked lists in there, so if we
have any place where we have two nested loops over the list of
indices, it won't be linear.  I can't think of a place where we do
that, but that doesn't mean there isn't one.  And even if they are
linear, or n log n or something, the coefficients could still be
lousy.  Theoretical computer science is one of my favorite subjects,
but, it doesn't always tell you what you want to know about the real
world.

It doesn't seem like it should be very hard to figure this out
empirically.  Just create a big database full of random data.  Maybe
you could populate one of the columns with something like (random() *
1000)::int.  Then you could create partial indices ON
(some_other_column) WHERE that_column = <blat> for <blat> in 0..999.
Then you could run some test queries and see how you make out.
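
A sketch of that experiment (the DO block needs PostgreSQL 9.0; on 8.4
the same loop can live in a plpgsql function instead):

CREATE TABLE big (that_column int, some_other_column int);
INSERT INTO big
SELECT (random() * 1000)::int, n
FROM generate_series(1, 1000000) AS g(n);

DO $$
BEGIN
    -- one partial index per value of that_column
    FOR i IN 0..999 LOOP
        EXECUTE 'CREATE INDEX big_idx_' || i
             || ' ON big (some_other_column) WHERE that_column = ' || i;
    END LOOP;
END;
$$;

ANALYZE big;
EXPLAIN ANALYZE
SELECT * FROM big WHERE that_column = 123 AND some_other_column = 456;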

Or, I mean, you can read the source code.  That's fine, too.  It's
just... I've already read the source code quite a few times, and I
still don't know the answer.  Admittedly, I wasn't trying to answer
this specific question, but still - I don't think it's an easy
question to answer that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: On Scalability

From
Vincenzo Romano
Date:
Any feedback from TGL or Heikki, then?

2010/7/29 Joshua D. Drake <jd@commandprompt.com>:
> On Thu, 2010-07-29 at 19:52 +0200, Vincenzo Romano wrote:
>> 2010/7/29 Joshua D. Drake <jd@commandprompt.com>:
>> > On Thu, 2010-07-29 at 19:34 +0200, Vincenzo Romano wrote:
>> >
>> >> I expect that a more complex schema will imply higher workloads
>> >> on the query planner. What I don't know is how the increase in the
>> >> workload will happen: linearly, sublinearly, polynomially or what?
>>
>> Do you think I should ask somewhere else?
>> Any hint?
>
> The two people that would likely know the best are on vacation, TGL and
> Heikki. You may have to wait a bit.
>
> Sincerely,
>
> Joshua D. Drake
>
> --
> PostgreSQL.org Major Contributor
> Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
> Consulting, Training, Support, Custom Development, Engineering
> http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
>
>



-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.romano@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Heikki Linnakangas
Date:
On 07.10.2010 10:09, Vincenzo Romano wrote:
> Any feedbacks from TGL and Heikki, then?

I don't have anything to add to what others said already. Your best 
advice is to test it yourself.

I would expect the plan time to be linear relative to the number of 
partial indexes or child tables involved, except that constraint 
exclusion of CHECK constraints on the partitions is exponential. But I 
also wouldn't be surprised if there's some other non-linear aspect there 
that shows its head with thousands of partitions.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: On Scalability

From
Simon Riggs
Date:
On Thu, 2010-10-07 at 10:28 +0300, Heikki Linnakangas wrote:

> constraint exclusion of CHECK constraints on the partitions is
> exponential

Constraint exclusion is linear with respect to number of partitions.
Why do you say exponential?
--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: On Scalability

From
Heikki Linnakangas
Date:
On 07.10.2010 10:41, Simon Riggs wrote:
> On Thu, 2010-10-07 at 10:28 +0300, Heikki Linnakangas wrote:
>
>> constraint exclusion of CHECK constraints on the partitions is
>> exponential
>
> Constraint exclusion is linear with respect to number of partitions.
> Why do you say exponential?

For some reason I thought the planner needs to check the constraints of 
the partitions against each other, but you're right, clearly that's not 
the case. Linear it is.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> On 07.10.2010 10:41, Simon Riggs wrote:
>>
>> On Thu, 2010-10-07 at 10:28 +0300, Heikki Linnakangas wrote:
>>
>>> constraint exclusion of CHECK constraints on the partitions is
>>> exponential
>>
>> Constraint exclusion is linear with respect to number of partitions.
>> Why do you say exponential?
>
> For some reason I thought the planner needs to check the constraints of the
> partitions against each other, but you're right, clearly that's not the
> case. Linear it is.
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>

Making these things sub-linear (whether O(log n) or even O(1)),
provided there's a way to, would make this RDBMS more appealing
to enterprises.
I mean also partial indexes (as an alternative to table partitioning).
Being able to cope effectively with only "a dozen child tables or so"
makes it more like an amateur feature.
If you really need partitioning (or just hierarchical stuff), I think
you'll need quite a bit more than a dozen items.
If you partition by just weeks, you'll need 50+ a year.

Is there any precise direction to where look into the code for it?

Is there a way to put this into a wish list?

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Robert Haas
Date:
On Thu, Oct 7, 2010 at 8:10 AM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:
> Making these things sub-linear (whether O(log n) or even O(1)),
> provided there's a way to, would make this RDBMS more appealing
> to enterprises.
> I mean also partial indexes (as an alternative to table partitioning).
> Being able to cope effectively with only "a dozen child tables or so"
> makes it more like an amateur feature.
> If you really need partitioning (or just hierarchical stuff), I think
> you'll need quite a bit more than a dozen items.
> If you partition by just weeks, you'll need 50+ a year.
>
> Is there any precise direction to where look into the code for it?
>
> Is there a way to put this into a wish list?

Well, you can't just arbitrarily turn an O(n) algorithm into an O(lg n)
algorithm.  I think the most promising approach to scaling to large
numbers of partitions is the patch that Itagaki Takahiro was working
on back in July.  Unfortunately, that patch still needs a lot of work
- and some redesign - before it will really meet our needs.  Right
now, the way to set up partitioning is to create a parent table and
then create a bunch of child tables that inherit from them and then
put mutually exclusive CHECK constraints on all the children and make
sure constraint_exclusion is on so that the planner can notice when
not all children need to be scanned.  As a totally general
architecture, this is probably hard to beat (or to make sublinear).
However, if we have DDL that allows the user to say: this is a set of
child tables that are range partitions on this key column, with these
boundaries, then you should be able to make the constraint exclusion
calculations much more efficient, because it won't have to infer so
much from first principles.  O(lg n) doesn't seem out of the question
given that architecture.
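
For reference, that current setup looks like this in SQL (essentially
the example from the manual's partitioning chapter):

CREATE TABLE measurement (logdate date NOT NULL, peaktemp int);

CREATE TABLE measurement_y2010m01 (
    CHECK (logdate >= DATE '2010-01-01' AND logdate < DATE '2010-02-01')
) INHERITS (measurement);
CREATE TABLE measurement_y2010m02 (
    CHECK (logdate >= DATE '2010-02-01' AND logdate < DATE '2010-03-01')
) INHERITS (measurement);
-- ...one child per range, with mutually exclusive CHECK constraints

SET constraint_exclusion = partition;  -- the 8.4+ default

-- The planner must now prove, child by child, which ones to skip:
EXPLAIN SELECT * FROM measurement WHERE logdate = DATE '2010-01-15';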

I think, though, that that is still some way off.  If you're in a
position to help with (or fund) the coding, it can be made to happen
faster, of course.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: On Scalability

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 07.10.2010 10:41, Simon Riggs wrote:
>> Constraint exclusion is linear with respect to number of partitions.
>> Why do you say exponential?

> For some reason I thought the planner needs to check the constraints of 
> the partitions against each other, but you're right, clearly that's not 
> the case. Linear it is.

Well, it's really more like O(mn) where m is the number of partitions
and n is the number of clauses in the query --- and not only that, but
the O() notation is hiding a depressingly high constant factor.  And
then there are practical problems like failing to exclude partitions as
soon as there are any parameters in the query.

There's basically no way that we're going to get decent performance for
large numbers of partitions as long as we have to resort to
theorem-proving to lead us to the correct partition.
        regards, tom lane


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Robert Haas <robertmhaas@gmail.com>:
> Well, you can't just arbitrarily turn an O(n) algorithm into an O(lg n)

That's trivially true. I was not asking for the recipe to do it.

> algorithm.  I think the most promising approach to scaling to large
> numbers of partitions is the patch that Itagaki Takahiro was working
> on back in July.  Unfortunately, that patch still needs a lot of work
> - and some redesign - before it will really meet our needs.  Right
> now, the way to set up partitioning is to create a parent table and
> then create a bunch of child tables that inherit from them and then
> put mutually exclusive CHECK constraints on all the children and make
> sure constraint_exclusion is on so that the planner can notice when
> not all children need to be scanned.  As a totally general
> architecture, this is probably hard to beat (or to make sublinear).

This is exactly what's described in the official documentation.
Everyone I ask, before going deeper with tests, gives me the same answer:
don't try to use more than a dozen child tables.

> However, if we have DDL that allows the user to say: this is a set of
> child tables that are range partitions on this key column, with these
> boundaries, then you should be able to make the constraint exclusion
> calculations much more efficient, because it won't have to infer so
> much from first principles.  O(lg n) doesn't seem out of the question
> given that architecture.

I see the main problem in the way the planner "understands" which partition
is useful and which one is not.
Having the DDL supporting the feature could just be syntactic sugar
if the underlying mechanism is inadequate.

> I think, though, that that is still some way off.  If you're in a
> position to help with (or fund) the coding, it can be made to happen
> faster, of course.

This is why I was asking for directions: browsing the whole code to look
for the relevant stuff is quite time consuming.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Tom Lane <tgl@sss.pgh.pa.us>:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> On 07.10.2010 10:41, Simon Riggs wrote:
>>> Constraint exclusion is linear with respect to number of partitions.
>>> Why do you say exponential?
>
>> For some reason I thought the planner needs to check the constraints of
>> the partitions against each other, but you're right, clearly that's not
>> the case. Linear it is.
>
> Well, it's really more like O(mn) where m is the number of partitions
> and n is the number of clauses in the query --- and not only that, but
> the O() notation is hiding a depressingly high constant factor.  And
> then there are practical problems like failing to exclude partitions as
> soon as there are any parameters in the query.


Do the same considerations apply to partial indexes?
I mean, I could replace table partitioning with the index-partitioning concept.
(Well, I know it's not really the same.)
Would it then be the same O(mn) for the planner to choose the right indexes
for a given query?

> There's basically no way that we're going to get decent performance for
> large numbers of partitions as long as we have to resort to
> theorem-proving to lead us to the correct partition.
>
>                        regards, tom lane
>

I'm not sure about MySQL, but Oracle can handle large partitioning.
So I would say there's a way to achieve the same goal.

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.romano@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Stephen Frost
Date:
* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> I see the main problem in the way the planner "understands" which partition
> is useful and which one is not.
> Having the DDL supporting the feature could just be syntactic sugar
> if the underlying mechanism is inadequate.

I'm pretty sure the point with the DDL would be to have a way for the
user to communicate to the planner more understanding about the
partitioning, not just to be syntactic sugar.  With that additional
information, the planner can make a faster and better decision.

Stephen

Re: On Scalability

From
Greg Smith
Date:
Vincenzo Romano wrote:
> I see the main problem in the way the planner "understands" which partition
> is useful and which one is not.
> Having the DDL supporting the feature could just be syntactic sugar
> if the underlying mechanism is inadequate.
>   

You have the order of this backwards.  In order to do better than the 
way the current scheme is implemented, the optimizer needs higher 
quality metadata about the structure of the partitions to work with.  
Right now, it's inferring them from the CHECK constraints, which 
requires the whole theorem-proving bit Tom mentioned.  That's never 
going to get any more algorithmically efficient than it already is.

If the DDL that created the partitions also made better quality metadata 
available about the structure of the partitions, at that point it would 
be possible to also do better in how the optimizer pruned partitions to 
consider too.  If the list it has was known to be in a particular 
structured/sorted order, the optimizer could do a binary search to find 
relevant partitions, rather than the linear scan required right now.

Until that work is done, any other improvement attempts are doomed to 
fail.  That's the point Robert was trying to make to you.  And the fact 
Oracle does this is why it's able to scale to high partition counts 
better than PostgreSQL can.

You can read more about the work that was being done here at 
http://wiki.postgresql.org/wiki/Table_partitioning

-- 
Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us




Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Stephen Frost <sfrost@snowman.net>:
> * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> I see the main problem in the way the planner "understands" which partition
>> is useful and which one is not.
>> Having the DDL supporting the feature could just be syntactic sugar
>> if the underlying mechanism is inadequate.
>
> I'm pretty sure the point with the DDL would be to have a way for the
> user to communicate to the planner more understanding about the
> partitioning, not just to be syntactic sugar.  With that additional
> information, the planner can make a faster and better decision.
>
>        Stephen

Which kind of information are you thinking about?
I think that the stuff you put into the CHECK condition for the table
will say it all.
In fact there you have not just the column names with relevant values, but
the actual expression(s) to be checked.

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Stephen Frost
Date:
* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> Which kind of information are you thinking about?
> I think that the stuff you put into the CHECK condition for the table
> will say it all.

The problem is that CHECK conditions can contain just about anything,
hence the planner needs to deal with that possibility.

> In fact there you have not just the column names with relevant values, but
> the actual expression(s) to be checked.

Yes, that would be the problem.  Proving something based on expressions
is a lot more time consuming and complicated than being explicitly told
what goes where.

Stephen

Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Greg Smith <greg@2ndquadrant.com>:
> Vincenzo Romano wrote:
>>
>> I see the main problem in the way the planner "understands" which
>> partition
>> is useful and which one is not.
>> Having the DDL supporting the feature could just be syntactic sugar
>> if the underlying mechanism is inadequate.
>>
>
> You have the order of this backwards.  In order to do better than the way
> the current scheme is implemented, the optimizer needs higher quality
> metadata about the structure of the partitions to work with.  Right now,
> it's inferring them from the CHECK constraints, which requires the whole
> theorem-proving bit Tom mentioned.  That's never going to get any more
> algorithmically efficient than it already is.
> If the DDL that created the partitions also made better quality metadata
> available about the structure of the partitions, at that point it would be
> possible to also do better in how the optimizer pruned partitions to
> consider too.  If the list it has was known to be in a particular
> structured/sorted order, the optimizer could do a binary search to find
> relevant partitions, rather than the linear scan required right now.


Do you mean the check constraint is used as plain text to be (somehow) executed?
If this is the case, then you (all) are perfectly and obviously right and
I'm just fishing for bicycles in the sea.

I would expect a parser to ... ehm ... parse the CHECK constraint
expression at "CREATE TABLE" time and extract all the needed "high
quality metadata", like the list of columns involved and the type of
checks (range, value list, etc.).
The same would be useful for partial indexes, as well.

But maybe this is just wishful thinking.

> Until that work is done, any other improvement attempts are doomed to fail.
>  That's the point Robert was trying to make to you.  And the fact Oracle
> does this is why it's able to scale to high partition counts better than
> PostgreSQL can.
>
> You can read more about the work that was being done here at
> http://wiki.postgresql.org/wiki/Table_partitioning

Done. As well as the official documentation.
The point is that there are no hints on the topic.
There should be a "caveat" in the documentation saying that partitioning
is not scalable. As well as partial indexing.

Thanks so far for the information.

> --
> Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
> PostgreSQL Training, Services and Support  www.2ndQuadrant.us

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Alvaro Herrera
Date:
Excerpts from Vincenzo Romano's message of Thu Oct 07 10:44:34 -0400 2010:

> Do you mean the check constraint is used as plain text to be (somehow) executed?
> If this is the case, then you (all) are perfectly and obviously right and
> I'm just fishing for bicycles in the sea.

Yeah, hence this thread hasn't advanced things very much in any useful
direction.  That we need to improve the partitioning implementation is
already known.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Alvaro Herrera <alvherre@commandprompt.com>:
> Excerpts from Vincenzo Romano's message of Thu Oct 07 10:44:34 -0400 2010:
>
>> Do you mean the check constraint is used as plain text to be (somehow) executed?
>> If this is the case, then you (all) are perfectly and obviously right and
>> I'm just fishing for bicycles in the sea.
>
> Yeah, hence this thread hasn't advanced things very much in any useful
> direction.  That we need to improve the partitioning implementation is
> already known.

Maybe I'm willing to help, and possibly able to.
But first I need to understand things that are already known to others
but not yet to me.

> --
> Álvaro Herrera <alvherre@commandprompt.com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Stephen Frost <sfrost@snowman.net>:
> * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> Which kind of information are you thinking about?
>> I think that the stuff you put into the CHECK condition for the table
>> will say it all.
>
> The problem is that CHECK conditions can contain just about anything,
> hence the planner needs to deal with that possibility.

Not really. For partitioning there would be some constraints as you
have in the DEFAULT values.

>> Infact there you have not just the column names with relevant values, but the
>> actual expression(s) to be checked,
>
> Yes, that would be the problem.  Proving something based on expressions
> is alot more time consuming and complicated than being explicitly told
> what goes where.

Consuming computing resources at DDL-time should be OK if that will
lead to big savings at DML-time (run-time), in my opinion. It'd be just
like compile-time optimizations.

>        Stephen
>

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Stephen Frost
Date:
* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> I would expect a parser to ... ehm ... parse the CHECK constraint
> expression at "CREATE TABLE " time and
> extract all the needed "high quality metadata", like the list of
> columns involved and the type of
> checks (range, value list, etc.).

Check constraints can be added after the table is created.  Inheritance
can be added/changed independently of check constraints.  Hacking all of
the inheritance, check constraint creation, and any other possibly
involved code paths to try to figure out if this particular table, check
constraint, inheritance relationship, etc, is part of a partitioning
setup isn't exactly trivial, or the right approach.

Thanks,
    Stephen

Re: On Scalability

From
Stephen Frost
Date:
* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> 2010/10/7 Stephen Frost <sfrost@snowman.net>:
> > * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> > The problem is that CHECK conditions can contain just about anything,
> > hence the planner needs to deal with that possibility.
>
> Not really. For partitioning there would be some constraints as you
> have in the DEFAULT values.

How do we know when it's partitioning and not a CHECK constraint being
used for something else..?  I'll tell you: through the user using
specific partitioning DDL statements.
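
Purely for illustration, such dedicated DDL might look something like
this (hypothetical syntax; nothing like it exists in PostgreSQL today):

-- Hypothetical, illustrative syntax only.  The win is that the planner
-- would get structured metadata (partition key, sorted bounds) instead
-- of having to reverse-engineer free-form CHECK expressions:
CREATE TABLE measurement (logdate date NOT NULL, peaktemp int)
    PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2010m01 PARTITION OF measurement
    FOR VALUES FROM ('2010-01-01') TO ('2010-02-01');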

> Consuming computing resources at DDL-time should be OK if that will
> lead to big savings at DML-time (run-time), my opinion. It'd be just like
> compile time optimizations.

CHECK constraints, inheritance, etc, are general things which can be
used for more than just partitioning.  Abusing them to go through tons
of extra gyrations to make the specific partitioning case faster at DML
time (if that's really even possible...  I'm not convinced you could
make it bullet-proof) isn't a good approach.

Thanks,
    Stephen

Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Stephen Frost <sfrost@snowman.net>:
> * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> 2010/10/7 Stephen Frost <sfrost@snowman.net>:
>> > * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> > The problem is that CHECK conditions can contain just about anything,
>> > hence the planner needs to deal with that possibility.
>>
>> Not really. For partitioning there would be some constraints as you
>> have in the DEFAULT values.
>
> How do we know when it's partitioning and not a CHECK constraint being
> used for something else..?

Why ask? You don't need to tell them apart.
"Just" parse the expression and extract the metadata to be used when the
expression needs to be evaluated. Whether it's a "plain" CHECK constraint
or something for partition management would then be irrelevant.

> I'll tell you- through the user using
> specific partitioning DDL statements.

That could be the next step, once  the underlying stuff is already in place.

>> Consuming computing resources at DDL-time should be OK if that will
>> lead to big savings at DML-time (run-time), my opinion. It'd be just like
>> compile time optimizations.
>
> CHECK constraints, inheritance, etc, are general things which can be
> used for more than just partitioning.  Abusing them to go through tons
> of extra gyrations to make the specific partitioning case faster at DML
> time (if that's really even possible...  I'm not convinced you could
> make it bullet-proof) isn't a good approach.

At the moment I'm not interested in particular cases.
I think that CHECK constraints (as well as partial index expressions) should
be handled in a more effective way. Better partitioning (both for tables
and indexes) would be a side effect.

Thanks for the insights.

>        Thanks,
>
>                Stephen
>


--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Stephen Frost <sfrost@snowman.net>:
> * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> I would expect a parser to ... ehm ... parse the CHECK constraint
>> expression at "CREATE TABLE " time and
>> extract all the needed "high quality metadata", like the list of
>> columns involved and the type of
>> checks (range, value list, etc.).
>
> Check constraints can be added after the table is created.  Inheritance
> can be added/changed independently of check constraints.  Hacking all of
> the inheritance, check constraint creation, and any other possibly
> involved code paths to try to figure out if this particular table, check
> constraint, inheritance relationship, etc, is part of a partitioning
> setup isn't exactly trivial, or the right approach.
>
>        Thanks,
>
>                Stephen

I think no one would say these things are trivial.
So, what'd be the right approach in your vision?
I mean, if you think about partitioning a-la Oracle, then you'll have to
parse those expressions anyway.

--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.romano@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
"Kevin Grittner"
Date:
Vincenzo Romano <vincenzo.romano@notorand.it> wrote:
> 2010/10/7 Stephen Frost <sfrost@snowman.net>:
>> Yes, that would be the problem.  Proving something based on
>> expressions is a lot more time consuming and complicated than
>> being explicitly told what goes where.
> 
> Consuming computing resources at DDL-time should be OK if that
> will lead to big savings at DML-time (run-time), my opinion. It'd
> be just like compile time optimizations.

I think something you haven't entirely grasped is how pluggable
PostgreSQL is -- you can not only define your own functions in a
wide variety of languages (including C), but your own data types,
operators, casts, index strategies, etc.  Determining, even at DDL
time that even a built-in datatype's expression is or isn't useful
in partitioning could be quite painful in the absence of syntax
specifically geared toward partitioning.  If there's a CHECK
constraint on a polygon column to ensure that it isn't a concave
polygon, you might be facing a lot of work to know whether it's
involved in partitioning.  Now imagine that a CHECK constraint is on
a column with a user defined type and uses the @%!! operator and
that the user has changed some of the allowed implicit casts used in
the expression.

While this flexibility is a great strength of PostgreSQL, it makes
some things more difficult to implement than they would be in more
limited database products.

-Kevin


Re: On Scalability

From
Stephen Frost
Date:
* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> So, what'd be the right approach in your vision?

Have you read http://wiki.postgresql.org/wiki/Table_partitioning and the
various places it links to..?

> I mean, if you think about partitioning a-la Oracle, then you'll have to
> parse those expressions anyway.

Oracle's approach is discussed there.

Thanks,
    Stephen

Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Stephen Frost <sfrost@snowman.net>:
> * Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
>> So, what'd be the right approach in your vision?
>
> Have you read http://wiki.postgresql.org/wiki/Table_partitioning and the
> various places it links to..?
>
>> I mean, if you think about partitioning a-la Oracle, then you'll have to
>> parse those expressions anyway.
>
> Oracle's approach is discussed there.

I didn't mean the implementation, but the goals achieved.

>
>        Thanks,
>
>                Stephen
>



--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Greg Stark
Date:
Firstly I want to say I think this discussion is over-looking some
benefits of the current system in other use cases. I don't think we
should get rid of the current system even once we have "proper"
partitioning. It solves quite well use cases such as data warehouse
queries that need to do a full table scan of some subset of the data
which happens to be located in a single sub-table. In that case being
able to do a sequential scan instead of an index range scan is a big
benefit and the overhead of the analysis is irrelevant for a data
warehouse query. And the constraint may or may not have anything to do
with the partitioning key. You could have constraints like "customer_id
in (...)" for last month's financial records so lookups for new
customers don't need to check all the historical tables from before
they became customers.

In fact what I'm interested in doing is extending the support to use
stats on children marked read-only. If we have a histogram for a table
which has been marked read-only since the table was analyzed then we
could trust the upper and lower bounds or the most-frequent-list to
exclude partitions. That would really help for things like date-range
lookups on tables where the partition key is "financial quarter" or
"invoice_id" or some other nearly perfectly correlated column.

None of this replaces having a good partitioning story for OLTP
queries and management needs. But it extends the usefulness of that
setup to data warehouse queries on other related columns that haven't
been explicitly declared as the partitioning key.


On Thu, Oct 7, 2010 at 8:35 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Vincenzo Romano <vincenzo.romano@notorand.it> wrote:
>> 2010/10/7 Stephen Frost <sfrost@snowman.net>:
>
>>> Yes, that would be the problem.  Proving something based on
>>> expressions is alot more time consuming and complicated than
>>> being explicitly told what goes where.
>>
>> Consuming computing resources at DDL-time should be OK if that
>> will lead to big savings at DML-time (run-time), my opinion. It'd
>> be just like compile time optimizations.
>
> I think something you haven't entirely grasped is how pluggable
> PostgreSQL is -- you can not only define your own functions in a
> wide variety of languages (including C), but your own data types,
> operators, casts, index strategies, etc.

I suspect it's likely that a partitioning system would only work with
btree opclasses anyways. It might be interesting to think about what
it would take to make the setups we've talked about in the past work
with arbitrary operator classes as long as those operator classes
support some concept of "mutually exclusive". But nothing we've talked
about so far would be that flexible.

Pre-analyzing the check constraints to construct a partitioning data
structure might even be a plausible way to move forward -- I don't see
any obvious show-stoppers. The system could look for a set of btree
opclass based conditions that guarantee all the partitions are
mutually exclusive ranges.

My instincts tell me it would be less useful though because there's
less the system would be able to do with that structure to help the
user. That is, if it *can't* prove the constraints are mutually
exclusive then the user is left with a bunch of check constraints and
no useful feedback about what they've done wrong. And if it can prove
it the user is happy but the next time he has to add a partition he
has to look at the existing partitions and carefully construct his
check constraint instead of having the system help him out by
supplying one side of the bounds and providing a convenient syntax. It
would also be hard to specify how to automatically add partitions
which I expect is a feature people will want eventually.

There are some plus sides as well -- allowing some optimizations for
check constraints without requiring the user to promise to always use
that as their partitioning key in the future. But I think on the whole
it would be a disadvantage.


--
greg


Re: On Scalability

From
Simon Riggs
Date:
On Thu, 2010-10-07 at 14:10 +0200, Vincenzo Romano wrote:

> Making these things sub-linear (whether O(log n) or even O(1)),
> provided there's a way to, would make this RDBMS more appealing
> to enterprises.
> I mean also partial indexes (as an alternative to table partitioning).
> Being able to cope effectively with only "a dozen child tables or so"
> makes it more like an amateur feature.
> If you really need partitioning (or just hierarchical stuff), I think
> you'll need quite a bit more than a dozen items.
> If you partition by just weeks, you'll need 50+ a year.
> 
> Is there any precise direction to where look into the code for it?
> 
> Is there a way to put this into a wish list?

It's already on the wish list ("TODO") and has been for many years.

We've mostly lacked somebody with the experience and time/funding to
complete that implementation work. I figure I'll be doing it for 9.2
now; it may be difficult to do this for next release.

Theoretically, this can be O(n.log n) for range partitioning and O(1)
for exact value partitioning, though the latter isn't a frequent use
case.

Your conclusion that the current partitioning only works with a dozen or
so items doesn't match the experience of current users however.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Simon Riggs <simon@2ndquadrant.com>:
> On Thu, 2010-10-07 at 14:10 +0200, Vincenzo Romano wrote:
>
>> Making these things sub-linear (whether O(log n) or even O(1)),
>> provided there's a way to, would make this RDBMS more appealing
>> to enterprises.
>> I mean also partial indexes (as an alternative to table partitioning).
>> Being able to cope effectively with only "a dozen child tables or so"
>> makes it more like an amateur feature.
>> If you really need partitioning (or just hierarchical stuff), I think
>> you'll need quite a bit more than a dozen items.
>> If you partition by just weeks, you'll need 50+ a year.
>>
>> Is there any precise direction to where look into the code for it?
>>
>> Is there a way to put this into a wish list?
>
> It's already on the wish list ("TODO") and has been for many years.
>
> We've mostly lacked somebody with the experience and time/funding to
> complete that implementation work. I figure I'll be doing it for 9.2
> now; it may be difficult to do this for next release.
>
> Theoretically, this can be O(n.log n) for range partitioning and O(1)
> for exact value partitioning, though the latter isn't a frequent use
> case.

O(n*log n) is what I would expect from a good algorithm.

> Your conclusion that the current partitioning only works with a dozen or
> so items doesn't match the experience of current users however.

People on the mailing lists say so.
I think I'm now forced to plan for tests on our side, although this is not
what I'd like to do with the "most advanced open source database".
I'll publish the results on my blog, anyway.

> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Development, 24x7 Support, Training and Services
>
>



--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.romano@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Vincenzo Romano
Date:
2010/10/7 Simon Riggs <simon@2ndquadrant.com>:
> On Thu, 2010-10-07 at 14:10 +0200, Vincenzo Romano wrote:
>
>> Making these things sub-linear (whether O(log n) or even O(1)),
>> provided there's a way to, would make this RDBMS more appealing
>> to enterprises.
>> I mean also partial indexes (as an alternative to table partitioning).
>> Being able to cope effectively with only "a dozen child tables or so"
>> makes it more like an amateur feature.
>> If you really need partitioning (or just hierarchical stuff), I think
>> you'll need quite a bit more than a dozen items.
>> If you partition by just weeks, you'll need 50+ a year.
>>
>> Is there any precise direction to where look into the code for it?
>>
>> Is there a way to put this into a wish list?
>
> It's already on the wish list ("TODO") and has been for many years.
>
> We've mostly lacked somebody with the experience and time/funding to
> complete that implementation work. I figure I'll be doing it for 9.2
> now; it may be difficult to do this for next release.
>
> Theoretically, this can be O(n.log n) for range partitioning and O(1)
> for exact value partitioning, though the latter isn't a frequent use
> case.
>
> Your conclusion that the current partitioning only works with a dozen or
> so items doesn't match the experience of current users however.
>
> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Development, 24x7 Support, Training and Services
>

Do the same conclusions apply to partial indexes?
I mean, if I have a large number (n>=100 or n>=1000) of partial indexes
on a single very large table (m>=10**12), how good is the planner at
choosing the right indexes to plan a query?
Does this algorithm also have superlinear complexity?


--
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS


Re: On Scalability

From
Greg Stark
Date:
On Fri, Oct 8, 2010 at 3:20 AM, Vincenzo Romano
<vincenzo.romano@notorand.it> wrote:
> Do the same conclusions apply to partial indexes?
> I mean, if I have a large number (n>=100 or n>=1000) of partial indexes
> on a single very large table (m>=10**12), how good is the planner at
> choosing the right indexes to plan a query?
> Does this algorithm also have superlinear complexity?

No, it's also linear. It needs to look at every partial index and
check to see whether it's a candidate for your query. Actually that's
true for regular indexes as well but it has the extra step of proving
that the partial index includes all the rows your query needs, which is
not a cheap step.
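
A tiny example of that proving step (made-up table):

CREATE TABLE orders (id int, status text, created date);
CREATE INDEX orders_open_idx ON orders (created) WHERE status = 'open';

-- Candidate: the WHERE clause provably implies the index predicate.
EXPLAIN SELECT * FROM orders
WHERE status = 'open' AND created > DATE '2010-10-01';

-- Rejected: nothing implies status = 'open', but the planner still
-- spends time checking before it can discard the index.
EXPLAIN SELECT * FROM orders WHERE created > DATE '2010-10-01';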

The size of the table isn't relevant though, except inasmuch as the
savings when actually running the query will be larger for larger
tables, so it may be worth spending more time planning queries on large
tables.

-- 
greg