Thread: Add protransform for numeric, varbit, and temporal types
Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds protransform functions to the length coercions for numeric, varbit, timestamp, timestamptz, time, timetz and interval. This mostly serves to make more ALTER TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) -> numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4). The rules for varbit are exactly the same as for varchar. Numeric is slightly more complex: * Flatten calls to our length coercion function that solely represent * increases in allowable precision. Scale changes mutate every datum, so * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an * unconstrained numeric, so a change from an unconstrained numeric to any * constrained numeric is also unoptimizable. time{,stamp}{,tz} are similar to varchar for these purposes, except that, for example, plain "timestamptz" is equivalent to "timestamptz(6)". interval has a vastly different typmod format, but the principles applicable to length coercion remain the same. Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is always a no-op when one would logically expect as much. Does there exist a timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4) due to floating point rounding? Even if so, I'm fairly comfortable calling it a feature rather than a bug to avoid perturbing values that way. After these patches, the only core length coercion casts not having protransform functions are those for "bpchar" and "bit". For those, we could only optimize trivial cases of no length change. I'm not planning to do so. Thanks, nm
Attachment
On Sat, Dec 31, 2011 at 7:36 PM, Noah Misch <noah@leadboat.com> wrote: > Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds > protransform functions to the length coercions for numeric, varbit, timestamp, > timestamptz, time, timetz and interval. This mostly serves to make more ALTER > TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) -> > numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4). > The rules for varbit are exactly the same as for varchar. Numeric is slightly > more complex: > > * Flatten calls to our length coercion function that solely represent > * increases in allowable precision. Scale changes mutate every datum, so > * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an > * unconstrained numeric, so a change from an unconstrained numeric to any > * constrained numeric is also unoptimizable. > > time{,stamp}{,tz} are similar to varchar for these purposes, except that, for > example, plain "timestamptz" is equivalent to "timestamptz(6)". interval has > a vastly different typmod format, but the principles applicable to length > coercion remain the same. > > Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is > always a no-op when one would logically expect as much. Does there exist a > timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4) > due to floating point rounding? Even if so, I'm fairly comfortable calling it > a feature rather than a bug to avoid perturbing values that way. > > After these patches, the only core length coercion casts not having > protransform functions are those for "bpchar" and "bit". For those, we could > only optimize trivial cases of no length change. I'm not planning to do so. This is cool stuff. I will plan to review this once the CF starts. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Jan 3, 2012 at 10:31 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Dec 31, 2011 at 7:36 PM, Noah Misch <noah@leadboat.com> wrote: >> Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds >> protransform functions to the length coercions for numeric, varbit, timestamp, >> timestamptz, time, timetz and interval. This mostly serves to make more ALTER >> TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) -> >> numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4). >> The rules for varbit are exactly the same as for varchar. Numeric is slightly >> more complex: >> >> * Flatten calls to our length coercion function that solely represent >> * increases in allowable precision. Scale changes mutate every datum, so >> * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an >> * unconstrained numeric, so a change from an unconstrained numeric to any >> * constrained numeric is also unoptimizable. >> >> time{,stamp}{,tz} are similar to varchar for these purposes, except that, for >> example, plain "timestamptz" is equivalent to "timestamptz(6)". interval has >> a vastly different typmod format, but the principles applicable to length >> coercion remain the same. >> >> Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is >> always a no-op when one would logically expect as much. Does there exist a >> timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4) >> due to floating point rounding? Even if so, I'm fairly comfortable calling it >> a feature rather than a bug to avoid perturbing values that way. >> >> After these patches, the only core length coercion casts not having >> protransform functions are those for "bpchar" and "bit". For those, we could >> only optimize trivial cases of no length change. I'm not planning to do so. > > This is cool stuff. I will plan to review this once the CF starts. I've committed the numeric and varbit patches and will look at the temporal one next. I did notice one odd thing, though: sometimes we seem to get a rebuild on the toast index for no obvious reason: rhaas=# set client_min_messages=debug1; SET rhaas=# create table foo (a serial primary key, b varbit); NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for serial column "foo.a" DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo" DEBUG: building index "foo_pkey" on table "foo" CREATE TABLE rhaas=# alter table foo alter column b set data type varbit(4); DEBUG: rewriting table "foo" DEBUG: building index "foo_pkey" on table "foo" ALTER TABLE rhaas=# alter table foo alter column b set data type varbit; DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430" ALTER TABLE Strangely, it doesn't happen if I add another column to the table: rhaas=# set client_min_messages=debug1; SET rhaas=# create table foo (a serial primary key, b varbit, c varbit); NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for serial column "foo.a" DEBUG: building index "pg_toast_16481_index" on table "pg_toast_16481" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo" DEBUG: building index "foo_pkey" on table "foo" CREATE TABLE rhaas=# alter table foo alter column b set data type varbit(4); DEBUG: building index "pg_toast_16490_index" on table "pg_toast_16490" DEBUG: rewriting table "foo" DEBUG: building index "foo_pkey" on table "foo" ALTER TABLE rhaas=# alter table foo alter column b set data type varbit; ALTER TABLE There may not be any particular harm in a useless rebuild of the TOAST table index (wasted effort excepted), but my lack of understanding of why this is happening causes me to fear that there's a bug, not so much in these patches as in the core alter table code that is enabling skipping table and index rebuilds. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Feb 07, 2012 at 12:43:11PM -0500, Robert Haas wrote: > I've committed the numeric and varbit patches and will look at the > temporal one next. Thanks. The comment you added to numeric_transform() has a few typos, "constained" -> "constrained" and "nodes" -> "notes". > I did notice one odd thing, though: sometimes we > seem to get a rebuild on the toast index for no obvious reason: > > rhaas=# set client_min_messages=debug1; > SET > rhaas=# create table foo (a serial primary key, b varbit); > NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for > serial column "foo.a" > DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430" > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "foo_pkey" for table "foo" > DEBUG: building index "foo_pkey" on table "foo" > CREATE TABLE > rhaas=# alter table foo alter column b set data type varbit(4); > DEBUG: rewriting table "foo" > DEBUG: building index "foo_pkey" on table "foo" > ALTER TABLE When we rebuild the table at this point, it has a small maximum tuple size. Therefore, we omit the toast table entirely. > rhaas=# alter table foo alter column b set data type varbit; > DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430" > ALTER TABLE This command makes the tuple size unbounded again, so we create a toast table. > Strangely, it doesn't happen if I add another column to the table: > > rhaas=# set client_min_messages=debug1; > SET > rhaas=# create table foo (a serial primary key, b varbit, c varbit); With the extra unconstrained varbit column, the tuple size is continuously unbounded and the table has a toast table at all stages. So, the difference in behavior is correct, albeit unintuitive. > NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for > serial column "foo.a" > DEBUG: building index "pg_toast_16481_index" on table "pg_toast_16481" > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "foo_pkey" for table "foo" > DEBUG: building index "foo_pkey" on table "foo" > CREATE TABLE > rhaas=# alter table foo alter column b set data type varbit(4); > DEBUG: building index "pg_toast_16490_index" on table "pg_toast_16490" > DEBUG: rewriting table "foo" > DEBUG: building index "foo_pkey" on table "foo" > ALTER TABLE > rhaas=# alter table foo alter column b set data type varbit; > ALTER TABLE
On Tue, Feb 7, 2012 at 8:45 PM, Noah Misch <noah@leadboat.com> wrote: > On Tue, Feb 07, 2012 at 12:43:11PM -0500, Robert Haas wrote: >> I've committed the numeric and varbit patches and will look at the >> temporal one next. > > Thanks. The comment you added to numeric_transform() has a few typos, > "constained" -> "constrained" and "nodes" -> "notes". Yuck, that's pretty bad. Thanks, fixed (I hope). > When we rebuild the table at this point, it has a small maximum tuple size. > Therefore, we omit the toast table entirely. Ah, OK. The debug messages might be more clear if they mentioned whether or not we were omitting the TOAST table, I suppose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Feb 7, 2012 at 12:43 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I've committed the numeric and varbit patches and will look at the > temporal one next. Committed, after changing the OIDs so they don't conflict. Here's one more case for you to ponder: rhaas=# create table noah (i interval day); CREATE TABLE rhaas=# alter table noah alter column i set data type interval second(3); DEBUG: rewriting table "noah" ALTER TABLE Do we really need a rewrite in that case? The code acts like the interval range and precision are separate beasts, but is that really true? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 08, 2012 at 09:37:01AM -0500, Robert Haas wrote: > On Tue, Feb 7, 2012 at 12:43 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > I've committed the numeric and varbit patches and will look at the > > temporal one next. > > Committed, after changing the OIDs so they don't conflict. > > Here's one more case for you to ponder: > > rhaas=# create table noah (i interval day); > CREATE TABLE > rhaas=# alter table noah alter column i set data type interval second(3); > DEBUG: rewriting table "noah" > ALTER TABLE > > Do we really need a rewrite in that case? The code acts like the > interval range and precision are separate beasts, but is that really > true? The code has a thinko; a given interval typmod ultimately implies a single point from which we truncate rightward. The precision only matters if the range covers SECOND. Thanks; the attached patch improves this.
Attachment
On Thu, Feb 9, 2012 at 7:18 AM, Noah Misch <noah@leadboat.com> wrote: > The code has a thinko; a given interval typmod ultimately implies a single > point from which we truncate rightward. The precision only matters if the > range covers SECOND. Thanks; the attached patch improves this. Thanks, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company