Thread: List of "binary-compatible" data types
Folks, From our docs: "Adding a column with a non-null default or changing the type of an existing column will require the entire table and indexes to be rewritten. As an exception, if the USING clause does not change the column contents and the old type is either binary coercible to the new type or an unconstrained domain over the new type, a table rewrite is not needed ..." Which is nice, but nowhere do we present users with a set of binary-compatible data types, even among the built-in types. I'd happily write this up, if I knew what the binary-compatible data types *were*. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 4 November 2013 21:58, Josh Berkus <josh@agliodbs.com> wrote: > Folks, > > From our docs: > > "Adding a column with a non-null default or changing the type of an > existing column will require the entire table and indexes to be > rewritten. As an exception, if the USING clause does not change the > column contents and the old type is either binary coercible to the new > type or an unconstrained domain over the new type, a table rewrite is > not needed ..." > > Which is nice, but nowhere do we present users with a set of > binary-compatible data types, even among the built-in types. I'd > happily write this up, if I knew what the binary-compatible data types > *were*. You could try this: SELECT castsource::regtype::text, array_agg(casttarget::regtype order by casttarget::regtype::text) casttargets FROM pg_cast WHERE castmethod = 'b' GROUP BY 1 ORDER BY 1; -- Thom
Thom, > SELECT > castsource::regtype::text, > array_agg(casttarget::regtype order by casttarget::regtype::text) casttargets > FROM pg_cast > WHERE castmethod = 'b' > GROUP BY 1 > ORDER BY 1; Are we actually covering 100% of these for ALTER COLUMN now? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 11/04/2013 05:21 PM, Josh Berkus wrote: > Thom, > > >> SELECT >> castsource::regtype::text, >> array_agg(casttarget::regtype order by casttarget::regtype::text) casttargets >> FROM pg_cast >> WHERE castmethod = 'b' >> GROUP BY 1 >> ORDER BY 1; > > Are we actually covering 100% of these for ALTER COLUMN now? Also, JSON <--> Text seems to be missing from the possible binary conversions. That's a TODO, I suppose. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Mon, Nov 04, 2013 at 05:23:36PM -0800, Josh Berkus wrote: > On 11/04/2013 05:21 PM, Josh Berkus wrote: > > Thom, > > > > > >> SELECT > >> castsource::regtype::text, > >> array_agg(casttarget::regtype order by casttarget::regtype::text) casttargets > >> FROM pg_cast > >> WHERE castmethod = 'b' > >> GROUP BY 1 > >> ORDER BY 1; > > > > Are we actually covering 100% of these for ALTER COLUMN now? Yes; ALTER TABLE ALTER TYPE refers to the same metadata as Thom's query. If you add to the list by issuing CREATE CAST ... WITHOUT FUNCTION, ALTER TABLE will respect that, too. > Also, JSON <--> Text seems to be missing from the possible binary > conversions. That's a TODO, I suppose. Only json --> text, not json <-- text. Note that you can add the cast manually if you have an immediate need. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah, >> Also, JSON <--> Text seems to be missing from the possible binary >> conversions. That's a TODO, I suppose. > > Only json --> text, not json <-- text. Note that you can add the cast > manually if you have an immediate need. Huh? Why would text --> JSON require a physical rewrite? We have to validate it, sure, but we don't need to rewrite it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Tue, Nov 05, 2013 at 10:00:15AM -0800, Josh Berkus wrote: > Noah, > > >> Also, JSON <--> Text seems to be missing from the possible binary > >> conversions. That's a TODO, I suppose. > > > > Only json --> text, not json <-- text. Note that you can add the cast > > manually if you have an immediate need. > > Huh? Why would text --> JSON require a physical rewrite? We have to > validate it, sure, but we don't need to rewrite it. That's all true, but the system has no concept like "this cast validates the data, never changing it". We would first need to add metadata supporting such a concept. On the other hand, "create cast (json as text) without function;" leans only on concepts the system already knows. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Noah, > That's all true, but the system has no concept like "this cast validates the > data, never changing it". We would first need to add metadata supporting such > a concept. On the other hand, "create cast (json as text) without function;" > leans only on concepts the system already knows. > Yeah, I'm thinking it might be worth coming up with a solution for that specific case. As users upgrade from 9.0 and 9.1 to 9.3, they're going to want to convert their text columns containing JSON to columns of the JSON type, and are going to be surprised how painful that is. Of course, if we get binary JSON in 9.4 (Oleg?), then a binary conversion will be required, so maybe it's a moot point. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On 2013-11-05 11:15:29 -0800, Josh Berkus wrote: > Noah, > > > That's all true, but the system has no concept like "this cast validates the > > data, never changing it". We would first need to add metadata supporting such > > a concept. On the other hand, "create cast (json as text) without function;" > > leans only on concepts the system already knows. > > > > Yeah, I'm thinking it might be worth coming up with a solution for that > specific case. As users upgrade from 9.0 and 9.1 to 9.3, they're going > to want to convert their text columns containing JSON to columns of the > JSON type, and are going to be surprised how painful that is. There's zap chance of doing anything for 9.3, this would require quite a bit of code in tablecmds.c and that surely isn't going to happen in the backbranches. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres, > There's zap chance of doing anything for 9.3, this would require quite a > bit of code in tablecmds.c and that surely isn't going to happen in the > backbranches. Oh, sure, I was thinking of a workaround. Actually, being able to separate "need to check contents" from "need to rewrite values" could be useful for a lot of type conversions. I'd also love some way of doing a no-rewrite conversion between timestamp and timestamptz, based on the assumption that the original values are UTC time. That's one I encounter a lot. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Tue, Nov 05, 2013 at 05:02:58PM -0800, Josh Berkus wrote: > I'd also love some way of doing a no-rewrite conversion between > timestamp and timestamptz, based on the assumption that the original > values are UTC time. That's one I encounter a lot. It was such a conversion that motivated me to add the no-rewrite ALTER TABLE ALTER TYPE support in the first place. Interesting. Support for it didn't end up in any submitted patch due to a formal problem: a protransform function shall only consult IMMUTABLE facts, but we posit that timezone==UTC is a STABLE observation. However, a protransform function can easily simplify the immutable expression "tscol AT TIME ZONE 'UTC'", avoiding a rewrite. See attached patch. Examples: begin; create table t (c timestamptz); set client_min_messages = debug1; -- rewrite: depends on timezone GUC alter table t alter c type timestamp; -- rewrite: depends on timezone GUC alter table t alter c type timestamptz; -- no rewrite: always UTC+0 alter table t alter c type timestamp using c at time zone 'UTC'; -- no rewrite: always UTC+0 alter table t alter c type timestamptz using c at time zone 'Etc/Universal'; -- rewrite: always UTC+0 in the present day, but not historically alter table t alter c type timestamp using c at time zone 'Atlantic/Reykjavik'; -- rewrite: always UTC+0 in the present day, but not historically alter table t alter c type timestamptz using c at time zone 'Africa/Lome'; -- no rewrite: always UTC+0 alter table t alter c type timestamp using c at time zone 'GMT'; -- rewrite: always UTC+1 alter table t alter c type timestamptz using c at time zone '1 hour'::interval; -- no rewrite: always UTC+0 alter table t alter c type timestamp using c at time zone '0 hour'::interval; rollback;
Attachment
On Thu, Feb 05, 2015 at 08:36:18PM -0500, Noah Misch wrote: > On Tue, Nov 05, 2013 at 05:02:58PM -0800, Josh Berkus wrote: > > I'd also love some way of doing a no-rewrite conversion between > > timestamp and timestamptz, based on the assumption that the original > > values are UTC time. That's one I encounter a lot. > > It was such a conversion that motivated me to add the no-rewrite ALTER TABLE > ALTER TYPE support in the first place. Interesting. Support for it didn't > end up in any submitted patch due to a formal problem: a protransform function > shall only consult IMMUTABLE facts, but we posit that timezone==UTC is a > STABLE observation. However, a protransform function can easily simplify the > immutable expression "tscol AT TIME ZONE 'UTC'", avoiding a rewrite. See > attached patch. This (commit b8a18ad) ended up causing wrong EXPLAIN output and wrong indxpath.c processing. Hence, commit c22ecc6 neutralized the optimization; see that commit's threads for details. I pondered ways to solve those problems, but I didn't come up with anything satisfying for EXPLAIN. (One dead-end thought was to introduce an ExprShortcut node having "Node *semantics" and "Node *shortcut" fields, where "semantics" is deparsed for EXPLAIN and "shortcut" is actually evaluated. That would require teaching piles of code about the new node type, which isn't appropriate for the benefit in question.) Stepping back a bit, commit b8a18ad didn't provide a great UI. I doubt folks write queries this way spontaneously; to do so, they would have needed to learn that such syntax enables this optimization. If I'm going to do something more invasive, it should optimize the idiomatic "alter table t alter timestamptzcol type timestamp". One could do that with a facility like SupportRequestSimplify except permitted to consider STABLE facts. I suppose I could add a volatility field to SupportRequestSimplify. So far, I can't think of a second use case for such a facility, so instead I think ATColumnChangeRequiresRewrite() should have a hard-wired call for F_TIMESTAMPTZ_TIMESTAMP and F_TIMESTAMP_TIMESTAMPTZ. Patch attached. If we find more applications of this concept, it shouldn't be hard to migrate this logic into SupportRequestSimplify. Does anyone think that's better to do from the start? Thanks, nm
Attachment
On Tue, 26 Feb 2019 at 06:14, Noah Misch <noah@leadboat.com> wrote:
On Thu, Feb 05, 2015 at 08:36:18PM -0500, Noah Misch wrote:
> On Tue, Nov 05, 2013 at 05:02:58PM -0800, Josh Berkus wrote:
> > I'd also love some way of doing a no-rewrite conversion between
> > timestamp and timestamptz, based on the assumption that the original
> > values are UTC time. That's one I encounter a lot.
>
> It was such a conversion that motivated me to add the no-rewrite ALTER TABLE
> ALTER TYPE support in the first place. Interesting. Support for it didn't
> end up in any submitted patch due to a formal problem: a protransform function
> shall only consult IMMUTABLE facts, but we posit that timezone==UTC is a
> STABLE observation. However, a protransform function can easily simplify the
> immutable expression "tscol AT TIME ZONE 'UTC'", avoiding a rewrite. See
> attached patch.
This (commit b8a18ad) ended up causing wrong EXPLAIN output and wrong
indxpath.c processing. Hence, commit c22ecc6 neutralized the optimization;
see that commit's threads for details. I pondered ways to solve those
problems, but I didn't come up with anything satisfying for EXPLAIN. (One
dead-end thought was to introduce an ExprShortcut node having "Node
*semantics" and "Node *shortcut" fields, where "semantics" is deparsed for
EXPLAIN and "shortcut" is actually evaluated. That would require teaching
piles of code about the new node type, which isn't appropriate for the benefit
in question.)
Stepping back a bit, commit b8a18ad didn't provide a great UI. I doubt folks
write queries this way spontaneously; to do so, they would have needed to
learn that such syntax enables this optimization. If I'm going to do
something more invasive, it should optimize the idiomatic "alter table t alter
timestamptzcol type timestamp". One could do that with a facility like
SupportRequestSimplify except permitted to consider STABLE facts. I suppose I
could add a volatility field to SupportRequestSimplify. So far, I can't think
of a second use case for such a facility, so instead I think
ATColumnChangeRequiresRewrite() should have a hard-wired call for
F_TIMESTAMPTZ_TIMESTAMP and F_TIMESTAMP_TIMESTAMPTZ. Patch attached. If we
find more applications of this concept, it shouldn't be hard to migrate this
logic into SupportRequestSimplify. Does anyone think that's better to do from
the start?
Looks good, would need docs.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Noah Misch <noah@leadboat.com> writes: > Stepping back a bit, commit b8a18ad didn't provide a great UI. I doubt folks > write queries this way spontaneously; to do so, they would have needed to > learn that such syntax enables this optimization. If I'm going to do > something more invasive, it should optimize the idiomatic "alter table t alter > timestamptzcol type timestamp". One could do that with a facility like > SupportRequestSimplify except permitted to consider STABLE facts. I suppose I > could add a volatility field to SupportRequestSimplify. So far, I can't think > of a second use case for such a facility, so instead I think > ATColumnChangeRequiresRewrite() should have a hard-wired call for > F_TIMESTAMPTZ_TIMESTAMP and F_TIMESTAMP_TIMESTAMPTZ. Patch attached. If we > find more applications of this concept, it shouldn't be hard to migrate this > logic into SupportRequestSimplify. Does anyone think that's better to do from > the start? It'd be nice to get the SupportRequestSimplify API correct from the first release, so if there's even a slightly plausible reason for it to support this, I'd be inclined to err in the direction of doing so. On the other hand, if we really can't think of another use-case then a hard-wired fix might be the best way. One thing that we'd have to nail down a bit harder, if we're to add something to the SupportRequestSimplify API, is exactly what the semantics of the weaker check should be. The notion of "stable" has always been a bit squishy, in that it's not totally clear what span of time stability of the function result is being promised over. In the case at hand, for instance, is it really impossible for the timezone GUC to change during the execution of the ALTER TABLE command? You could probably break that if you tried hard enough, though it seems unlikely that anyone would do so accidentally. I also kind of wonder whether this case arises often enough for us to be expending so much effort optimizing it in the first place. No doubt, where one lives colors one's opinion of how likely it is that the timezone GUC is set to UTC ... but my guess is that that's not true in very many installations. regards, tom lane
On Tue, Feb 26, 2019 at 10:46:29AM -0500, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > Stepping back a bit, commit b8a18ad didn't provide a great UI. I doubt folks > > write queries this way spontaneously; to do so, they would have needed to > > learn that such syntax enables this optimization. If I'm going to do > > something more invasive, it should optimize the idiomatic "alter table t alter > > timestamptzcol type timestamp". One could do that with a facility like > > SupportRequestSimplify except permitted to consider STABLE facts. I suppose I > > could add a volatility field to SupportRequestSimplify. So far, I can't think > > of a second use case for such a facility, so instead I think > > ATColumnChangeRequiresRewrite() should have a hard-wired call for > > F_TIMESTAMPTZ_TIMESTAMP and F_TIMESTAMP_TIMESTAMPTZ. Patch attached. If we > > find more applications of this concept, it shouldn't be hard to migrate this > > logic into SupportRequestSimplify. Does anyone think that's better to do from > > the start? > > It'd be nice to get the SupportRequestSimplify API correct from the first > release, so if there's even a slightly plausible reason for it to support > this, I'd be inclined to err in the direction of doing so. On the other > hand, if we really can't think of another use-case then a hard-wired fix > might be the best way. Is it problematic to add a volatility field later? Older support functions will have needed to assume IMMUTABLE, and a simplification valid for IMMUTABLE is valid for all volatility levels. Still, yes, it would be nice to have from the start if we will indeed need it. > One thing that we'd have to nail down a bit harder, if we're to add > something to the SupportRequestSimplify API, is exactly what the semantics > of the weaker check should be. The notion of "stable" has always been a > bit squishy, in that it's not totally clear what span of time stability > of the function result is being promised over. In the case at hand, for > instance, is it really impossible for the timezone GUC to change during > the execution of the ALTER TABLE command? You could probably break that > if you tried hard enough, though it seems unlikely that anyone would do so > accidentally. No, one certainly can change a GUC during ALTER TABLE execution. The STABLE marking on current_setting is a fiction. I interpret that fiction as a signal that a query has undefined behavior if you change GUCs mid-query, which seems like a prudent level of effort toward such queries. Adding a volatility field to SupportRequestSimplify does invite C-language extension code to participate in deciding this undefined behavior, which would make it harder to verify that we like the undefined behavior of the moment (e.g. doesn't crash). Perhaps best not to overturn that rock?
Noah Misch <noah@leadboat.com> writes: > On Tue, Feb 26, 2019 at 10:46:29AM -0500, Tom Lane wrote: >> It'd be nice to get the SupportRequestSimplify API correct from the first >> release, so if there's even a slightly plausible reason for it to support >> this, I'd be inclined to err in the direction of doing so. > Is it problematic to add a volatility field later? Not hugely so, no. I'm thinking more in terms of support functions having to pay attention to which version they're being compiled for to know what they can do. But I suppose that's little worse than any other feature addition we make at the C API level. >> ... is it really impossible for the timezone GUC to change during >> the execution of the ALTER TABLE command? You could probably break that >> if you tried hard enough, though it seems unlikely that anyone would do so >> accidentally. > No, one certainly can change a GUC during ALTER TABLE execution. The STABLE > marking on current_setting is a fiction. I interpret that fiction as a signal > that a query has undefined behavior if you change GUCs mid-query, which seems > like a prudent level of effort toward such queries. Adding a volatility field > to SupportRequestSimplify does invite C-language extension code to participate > in deciding this undefined behavior, which would make it harder to verify that > we like the undefined behavior of the moment (e.g. doesn't crash). Perhaps > best not to overturn that rock? Probably not, unless we can come up with more convincing use-cases for it. For the moment, I'm content with the approach in the patch you posted. regards, tom lane
On Tue, Feb 26, 2019 at 02:29:01PM +0000, Simon Riggs wrote: > Looks good, would need docs. The ALTER TABLE page just says "old type is either binary coercible to the new type or an unconstrained domain over the new type." Avoiding rewrites by way of a prosupport function or the at-timestamp-v2.patch approach is essentially another way to achieve binary coercion. So far, we haven't documented the individual data types affected. Since we don't mention e.g. varchar(x) -> varchar(x+k) explicitly, I plan not to mention timestamp explicitly.