Re: Dynamic Partitioning using Segment Visibility Maps - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Dynamic Partitioning using Segment Visibility Maps
Msg-id 1199454003.18598.87.camel@ebony.site
In response to Re: Dynamic Partitioning using Segment Visibility Maps  (Markus Schiltknecht <markus@bluegap.ch>)
List pgsql-hackers
On Fri, 2008-01-04 at 13:29 +0100, Markus Schiltknecht wrote:

> Given that we are operating on segments here, to which the DBA has very 
> limited information and access, I prefer the term "Segment Exclusion". I 
> think of that as an optimization of sequential scans on tables with the 
> above mentioned characteristics.
> 
> > If we do need to differentiate between the two proposals, we can refer
> > to this one as the Segment Visibility Map (SVM).
> 
> I'm clearly in favor of separating between the two proposals. SVM is a 
> good name, IMHO.

OK, I'll refer to this proposal as SVM.

> > There would be additional complexity in selectivity estimation and plan
> > costing. The above proposal allows dynamic segment exclusion, which
> > cannot be assessed at planning time anyway, so suggestions welcome...
> 
> Hm.. that looks like a rather bad downside of an executor-only optimization.

I think that's generally true. We already have that problem with planned
statements and work_mem, for example, and parameterised query planning
is a difficult problem. Stable functions are already estimated at plan
time, so we should hopefully be getting that right. I don't see any
showstoppers here, just more of the usual problems of query optimization.
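
To make the executor-only point concrete, here is a minimal sketch in
Python, not PostgreSQL code. It assumes each segment carries a min/max
summary for one column; the Segment class, its fields and
scan_with_exclusion() are hypothetical names used only for illustration.
The search bounds arrive as run-time parameters, so the number of
segments skipped is only known once the scan executes, which is why
plan-time costing is hard.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One chunk of a table plus an assumed min/max summary of one column."""
    seg_no: int
    min_val: int  # smallest column value stored in this segment
    max_val: int  # largest column value stored in this segment
    rows: list


def scan_with_exclusion(segments, lo, hi):
    """Sequential scan that skips segments which cannot hold lo <= v <= hi.

    lo and hi are supplied at execution time (e.g. query parameters),
    so the set of excluded segments is not known when planning.
    """
    scanned = []
    matches = []
    for seg in segments:
        # Exclude the segment if its value range does not overlap [lo, hi].
        if seg.max_val < lo or seg.min_val > hi:
            continue
        scanned.append(seg.seg_no)
        matches.extend(v for v in seg.rows if lo <= v <= hi)
    return scanned, matches


segments = [
    Segment(0, 1, 100, list(range(1, 101))),
    Segment(1, 101, 200, list(range(101, 201))),
    Segment(2, 201, 300, list(range(201, 301))),
]
scanned, matches = scan_with_exclusion(segments, 150, 160)
print("segments scanned:", scanned)
print("rows matched:", len(matches))
```

With well-clustered data only one of the three segments is read; with
randomly distributed values every segment's range would overlap the
search range and nothing would be excluded.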

> > Comparison with other Partitioning approaches
> > ---------------------------------------------
> > 
> > Once I realised this was possible in a fairly automatic way, I've tried
> > hard to keep away from manual overrides, commands and new DDL.
> > 
> > Declarative partitioning is a big overhead, though worth it for large
> > databases. No overhead is *much* better though.
> > 
> > This approach to partitioning solves the following challenges
> > - allows automated table extension, so works automatically with Slony
> > - responds dynamically to changing data
> > - allows stable functions, nested loop joins and parametrised queries
> > - allows RI via SHARE locks
> > - avoids the need for operator push-down through Append nodes
> > - allows unique indexes
> > - allows global indexes (because we only have one table)
> > - allows advanced planning/execution using read-only/visible data
> > - works naturally with synchronous scans and buffer recycling
> > 
> > All of the above are going to take considerably longer to do in any of
> > the other ways I've come up with so far... 
> 
> I fully agree. But as I tried to point out above, the gains in 
> manageability from Segment Exclusion are also pretty close to zero. So 
> I'd argue they only fulfill parts of the needs for general horizontal 
> partitioning.

Agreed.

My focus for this proposal wasn't manageability, as it had been in other
recent proposals. I think there are some manageability wins to be had as
well, but we need to decide what sort of partitioning we want/need
first. 

So in the case of SVM, enhanced manageability is really a phase 2 thing.

Plus, you can always combine constraint exclusion and segment exclusion
in the same design.
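
As a rough illustration of such a combination, the Python sketch below
prunes in two stages: declared partition ranges are checked first (in the
spirit of constraint exclusion at plan time), then per-segment min/max
summaries are checked inside the surviving partitions (segment exclusion
at run time). All names here (Partition, Segment, plan, execute) and the
min/max summaries are assumptions made for the example, not part of any
existing patch.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    min_val: int  # assumed per-segment summary: smallest value present
    max_val: int  # assumed per-segment summary: largest value present


class Partition(NamedTuple):
    name: str
    check_lo: int  # declared constraint: check_lo <= value < check_hi
    check_hi: int
    segments: list


def plan(partitions, lo, hi):
    """Stage 1: keep partitions whose declared range overlaps [lo, hi]."""
    return [p for p in partitions if p.check_lo <= hi and lo < p.check_hi]


def execute(partitions, lo, hi):
    """Stage 2: within the planned partitions, list the segments to scan."""
    to_scan = []
    for p in partitions:
        for seg_no, seg in enumerate(p.segments):
            # Scan only segments whose observed range overlaps [lo, hi].
            if seg.min_val <= hi and lo <= seg.max_val:
                to_scan.append((p.name, seg_no))
    return to_scan


partitions = [
    Partition("y2006", 0, 1000, [Segment(0, 499), Segment(500, 999)]),
    Partition("y2007", 1000, 2000, [Segment(1000, 1499), Segment(1500, 1999)]),
]
survivors = plan(partitions, 1600, 1700)
to_scan = execute(survivors, 1600, 1700)
print([p.name for p in survivors], to_scan)
```

The first stage relies on what the DBA declared; the second relies only
on what the data in each segment actually contains.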

> Maybe a combination with CLUSTERing would be worthwhile? Or even 
> enforced CLUSTERing for the older segments?

I think there's merit in Heikki's "maintain cluster order" patch, and that
should do an even better job of maintaining locality.

Thanks for the detailed comments. I'll do my best to include all of the
viewpoints you've expressed as the design progresses.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com


