Re: Dynamic Partitioning using Segment Visibility Maps - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Dynamic Partitioning using Segment Visibility Maps |
Date | |
Msg-id | 1199454003.18598.87.camel@ebony.site |
In response to | Re: Dynamic Partitioning using Segment Visibility Maps (Markus Schiltknecht <markus@bluegap.ch>) |
List | pgsql-hackers |
On Fri, 2008-01-04 at 13:29 +0100, Markus Schiltknecht wrote:

> Given that we are operating on segments here, to which the DBA has very
> limited information and access, I prefer the term "Segment Exclusion". I
> think of that as an optimization of sequential scans on tables with the
> above mentioned characteristics.
>
> > If we do need to differentiate between the two proposals, we can refer
> > to this one as the Segment Visibility Map (SVM).
>
> I'm clearly in favor of separating between the two proposals. SVM is a
> good name, IMHO.

OK, I'll refer to this proposal as SVM.

> > There would be additional complexity in selectivity estimation and plan
> > costing. The above proposal allows dynamic segment exclusion, which
> > cannot be assessed at planning time anyway, so suggestions welcome...
>
> Hm.. that looks like a rather bad downside of an executor-only optimization.

I think that's generally true. We already have that problem with planned
statements and work_mem, for example, and parameterised query planning
is a difficult problem. Stable functions are already estimated at plan
time, so we should hopefully be getting that right. I don't see any show
stoppers here, just more of the usual problems of query optimization.

> > Comparison with other Partitioning approaches
> > ---------------------------------------------
> >
> > Once I realised this was possible in fairly automatic way, I've tried
> > hard to keep away from manual overrides, commands and new DDL.
> >
> > Declarative partitioning is a big overhead, though worth it for large
> > databases. No overhead is *much* better though.
> >
> > This approach to partitioning solves the following challenges
> > - allows automated table extension, so works automatically with Slony
> > - responds dynamically to changing data
> > - allows stable functions, nested loop joins and parametrised queries
> > - allows RI via SHARE locks
> > - avoids the need for operator push-down through Append nodes
> > - allows unique indexes
> > - allows both global indexes (because we only have one table)
> > - allows advanced planning/execution using read-only/visible data
> > - works naturally with synchronous scans and buffer recycling
> >
> > All of the above are going to take considerably longer to do in any of
> > the other ways I've come up with so far...
>
> I fully agree. But as I tried to point out above, the gains in
> manageability from Segment Exclusion are also pretty close to zero. So
> I'd argue they only fulfill parts of the needs for general horizontal
> partitioning.

Agreed. My focus for this proposal wasn't manageability, as it had been
in other recent proposals. I think there are some manageability wins to
be had as well, but we need to decide what sort of partitioning we
want/need first.

So in the case of SVM, enhanced manageability is really a phase 2 thing.

Plus, you can always have a design that combines constraint exclusion
and segment exclusion.

> Maybe a combination with CLUSTERing would be worthwhile? Or even
> enforced CLUSTERing for the older segments?

I think there's merit in Heikki's maintain cluster order patch, and that
should do an even better job of maintaining locality.

Thanks for the detailed comments. I'll do my best to include all of the
viewpoints you've expressed as the design progresses.

-- 
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com