Thread: Partitioning such that key field of inherited tables no longer retains any selectivity
Re: Partitioning such that key field of inherited tables no longer retains any selectivity
Tim Kane wrote:
> The subject line may not actually describe what I want to illustrate…
>
> Basically, let’s say we have a nicely partitioned data-set. Performance is
> a net win and I’m happy with it.
> The partitioning scheme is equality based, rather than range based.
>
> That is, each partition contains a subset of the data where
> partition_key = {some_value}, and of course we let constraint exclusion
> enable the optimiser to do its thing.
>
> As such, all of the data contained in a given partition has the same value
> for partition_key. That field, within the scope of its partition, isn’t
> terribly useful anymore, and in my mind is wasting bytes. Its only purpose
> really is to allow the CHECK constraint to verify the data is what it
> should be.
>
> Wouldn’t it be nice if we could somehow create a child table where we
> could define a const field value, that did not need to be stored on disk
> at the tuple level?
> This would allow the check constraint to supply the optimiser with the
> information it needs, while removing the need to consume disk to record a
> field whose value is always the same.
>
> Extending this idea..
> PostgreSQL could possibly look at any equality based check constraint for
> a table and, instead of storing each field value verbatim, we could
> implicitly optimise away the need to write those field values to disk, on
> the understanding that those values can never change (unless the
> constraint is removed/altered).
>
> I’m sure there are all kinds of worms in this canister, but I thought it
> might be an interesting discussion.
>
> Cheers,
>
> Tim

Two approaches:

1. Standard virtual column name that, when used, gets rewritten into a constant that is stored at the table level.
2. A way for a column's value to be defined as a function call.

Option 2 has the virtue of being more generally applicable, but you'd need some way to know, for any given table, that a given function resolves to a constant.

Maybe have a magic function like partitionid(tableoid) that, if used in a query, would be interpreted in this way. Combined with option 1, the standard column could be pre-defined in this way if the partition constant exists. The main thing to avoid is increased checking/rewriting time for non-partitioned tables.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Partitioning-such-that-key-field-of-inherited-tables-no-longer-retains-any-selectivity-tp5803549p5803561.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
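For readers following along, the setup described in the original post can be sketched roughly as below. This is a minimal illustration only: the table names (measurements, measurements_alpha, measurements_beta) and the columns other than partition_key are hypothetical and do not come from the thread.

```sql
-- Hypothetical names throughout; only partition_key comes from the thread.
-- Parent table: queries are issued against this one.
CREATE TABLE measurements (
    partition_key text        NOT NULL,
    recorded_at   timestamptz NOT NULL,
    reading       numeric
);

-- Equality-based partitions: every row in a child carries the same
-- partition_key value, which the CHECK constraint enforces.
CREATE TABLE measurements_alpha (
    CHECK (partition_key = 'alpha')
) INHERITS (measurements);

CREATE TABLE measurements_beta (
    CHECK (partition_key = 'beta')
) INHERITS (measurements);

-- 'partition' is already the default setting; it is set here only to
-- make the dependency on constraint exclusion explicit.
SET constraint_exclusion = partition;

-- The planner can skip measurements_beta, because its CHECK constraint
-- contradicts the WHERE clause.
EXPLAIN SELECT * FROM measurements WHERE partition_key = 'alpha';
```

Every row in measurements_alpha still stores the string 'alpha' in partition_key, which is the redundancy the post is asking about.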
Re: Re: Partitioning such that key field of inherited tables no longer retains any selectivity
David G Johnston <david.g.johnston@gmail.com> writes:
> Two approaches:
> 1. Standard virtual column name that, when used, gets rewritten into a
> constant that is stored at the table level.
> 2. A way for a column's value to be defined as a function call.

Recent versions of the SQL spec have a notion of "generated columns" that I think subsumes both of these concepts. We had a draft patch awhile back that attempted to implement that feature. It crashed and burned for reasons I don't recall ... but certainly implementing an already-standardized feature is more attractive than just inventing behavior on our own.

regards, tom lane
Re: Partitioning such that key field of inherited tables no longer retains any selectivity
> David G Johnston <[hidden email]> writes:
>> Two approaches:
>> 1. Standard virtual column name that, when used, gets rewritten into a
>> constant that is stored at the table level.
>> 2. A way for a column's value to be defined as a function call.
>
> Recent versions of the SQL spec have a notion of "generated columns"
> that I think subsumes both of these concepts. We had a draft patch
> awhile back that attempted to implement that feature. It crashed
> and burned for reasons I don't recall ... but certainly implementing
> an already-standardized feature is more attractive than just inventing
> behavior on our own.

That sounds interesting.
Is this what you are referring to? Actually, it looks like it would fit the bill and then some.

-----------------
4.14.8 Base columns and generated columns

A column of a base table is either a base column or a generated column. A base column is one that is not a generated column. A generated column is one whose values are determined by evaluation of a generation expression, a <value expression> whose declared type is by implication that of the column. A generation expression can reference base columns of the base table to which it belongs but cannot otherwise access SQL-data. Thus, the value of the field corresponding to a generated column in row R is determined by the values of zero or more other fields of R.

A generated column GC depends on each column that is referenced by a <column reference> in its generation expression, and each such referenced column is a parametric column of GC.
-----------------
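As a rough illustration of the spec text above, PostgreSQL 12 and later accept the standard's generated-column syntax in its stored form. The table and column names below are hypothetical and chosen only for the example.

```sql
-- Hypothetical table; requires PostgreSQL 12 or later.
CREATE TABLE readings (
    celsius    numeric NOT NULL,   -- base column
    -- Generated column: its value is determined by the generation
    -- expression, which may reference base columns of the same row only.
    fahrenheit numeric GENERATED ALWAYS AS (celsius * 9 / 5 + 32) STORED
);

-- Only base columns are supplied; fahrenheit is computed on write.
INSERT INTO readings (celsius) VALUES (100);

SELECT celsius, fahrenheit FROM readings;
```

A STORED generated column is still written to disk for every row. The example therefore shows the standard's concept, not the space saving requested at the top of the thread.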
Re: Re: Partitioning such that key field of inherited tables no longer retains any selectivity
This is basically what I intended to describe in "option 2"...without the benefit of ever having really read the SQL standard. So the planner would have to know that, for a given table, the generation expression results in a constant. It would likely in fact have to be a constant expression like, assuming a non-numeric value, ='column_value', where the "=" sign indicates that this is a generation expression and not a stored value (like default behaves currently).
Wouldn't it be way better if the constraints for partitioning by inheritance were set at the "master" table, instead of the way it's currently done, at the inherited tables (as exclusive CHECKs there)?
I mean a constraint like a "function(table columns) returning table_name or tablespace_name of the actual target table"?
<start pseudocode>
create table master (a int, b int, c int);
create table table_a () inherits (master);
create table table_b () inherits (master);
create function(a,b) returns text as $$ if a > b then return 'table_a'; else return 'table_b'; end if; end $$
... or:
create function(a,b) returns tablespace as $$ if a > b then return tablespace('table_a'); else return tablespace('table_b'); end if; end $$
alter table master add constraint "partitioning" check/select/route function(a,b);
<end pseudocode>
-R
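For comparison, the closest thing available with inheritance partitioning today is a routing trigger on the master table. The sketch below reuses the table names from the pseudocode above; the function and trigger names (master_route, master_route_insert) are hypothetical. It is one common way to route inserts, not an implementation of the proposed constraint syntax.

```sql
CREATE TABLE master (a int, b int, c int);
CREATE TABLE table_a () INHERITS (master);
CREATE TABLE table_b () INHERITS (master);

-- Routing logic lives in one place, attached to the master table.
CREATE FUNCTION master_route() RETURNS trigger AS $$
BEGIN
    IF NEW.a > NEW.b THEN
        INSERT INTO table_a VALUES (NEW.*);
    ELSE
        -- Also reached when a or b is NULL, as in the pseudocode's else branch.
        INSERT INTO table_b VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- the row has been redirected; do not store it in master
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER master_route_insert
    BEFORE INSERT ON master
    FOR EACH ROW EXECUTE PROCEDURE master_route();
```

A trigger like this only decides where rows are written. The planner learns nothing from it, so the per-child CHECK constraints are still needed for constraint exclusion. That is the gap the proposal above is trying to close.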