Thread: [HACKERS] Replication vs. float timestamps is a disaster
Both the streaming replication and logical replication areas of the code are, approximately, utterly broken when !HAVE_INT64_TIMESTAMPS. (The fact that "make check-world" passes anyway is an indictment of the quality of the regression tests.) I started poking around in this area after Thomas Munro reported that pg_recvlogical.c no longer even compiles with --disable-integer-datetimes, but the problems go much deeper than that. replication/logical/worker.c's send_feedback() is sending a backend timestamp directly as sendTime. This is wrong per the protocol spec, which says the field is an integer timestamp. I'm not sure if it's OK to just replace - pq_sendint64(reply_message, now); /* sendTime */ + pq_sendint64(reply_message, GetCurrentIntegerTimestamp()); /* sendTime */ ... is it important for the sent timestamp to exactly match the timestamp that was used earlier in the function for deciding whether to send or not? Should we instead try to convert the earlier logic to use integer timestamps? As for logical replication, logicalrep_write_begin and logicalrep_write_commit are likewise sending TimestampTz values using pq_sendint64, and this is much harder to patch locally since the values were taken from WAL records that contain TimestampTz. Although the sent values are nonsensical according to the protocol spec, logicalrep_read_begin and logicalrep_read_commit read them with pq_getmsgint64 and assign the results back to TimestampTz, which means that things sort of appear to work as long as you don't mind losing all sub-second precision in the timestamps. (But a decoder that was written to spec would think the values are garbage.) I'm inclined to think that at least for logical replication, we're going to have to give up the protocol spec wording that says that timestamps are integers, and instead admit that they are dependent on disable-integer-datetimes. Maybe we should do the same in the streaming replication protocol, because this area is going to be a very fertile source of bugs for the foreseeable future if we don't. It's not like replication between enable-integer-datetimes and disable-integer-datetimes hosts has any chance of being useful anyway. Thoughts? Should we double down on trying to make this work according to the "all integer timestamps" protocol specs, or cut our losses and change the specs? regards, tom lane
On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thoughts? Should we double down on trying to make this work according > to the "all integer timestamps" protocol specs, or cut our losses and > change the specs? I vote for doubling down. It's bad enough that we have so many internal details that depend on this setting; letting that cascade into the wire protocol seems like it's just letting the chaos spread farther and wider. Also, I wonder if we could consider deprecating and removing --disable-integer-datetimes at some point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Thoughts? Should we double down on trying to make this work according >> to the "all integer timestamps" protocol specs, or cut our losses and >> change the specs? > I vote for doubling down. It's bad enough that we have so many > internal details that depend on this setting; letting that cascade > into the wire protocol seems like it's just letting the chaos spread > farther and wider. How do you figure that it's not embedded in the wire protocol already? Not only the replicated data for a timestamp column, but also the client-visible binary I/O format, depend on this. I think having some parts of the protocol use a different timestamp format than other parts is simply weird, and as this exercise has shown, it's bug-prone as all get out. > Also, I wonder if we could consider deprecating and removing > --disable-integer-datetimes at some point. Seems like a different conversation. Although given the lack of replication bug reports so far, maybe nobody is using --disable-integer-datetimes anymore. Certainly, fixing these bugs by removing the --disable-integer-datetimes option would be a lot less painful than trying to make it actually work per protocol spec. regards, tom lane
On Sun, Feb 19, 2017 at 9:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Thoughts? Should we double down on trying to make this work according >>> to the "all integer timestamps" protocol specs, or cut our losses and >>> change the specs? > >> I vote for doubling down. It's bad enough that we have so many >> internal details that depend on this setting; letting that cascade >> into the wire protocol seems like it's just letting the chaos spread >> farther and wider. > > How do you figure that it's not embedded in the wire protocol already? > Not only the replicated data for a timestamp column, but also the > client-visible binary I/O format, depend on this. I think having some > parts of the protocol use a different timestamp format than other parts > is simply weird, and as this exercise has shown, it's bug-prone as all > get out. You've got a point, but if we don't make the replication protocol consistently use integers, then every client that cares about those fields has to be able to handle either format, or at least detect that it got a different format than it was expecting and fail cleanly. How's that going to work? Also, I feel like there's some difference between the data stored in a cluster and the metadata which describes the cluster. Making the interpretation of the metadata depend on the cluster configuration feels circular, like using non-ASCII characters in the name of an encoding. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-02-19 10:49:29 -0500, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Thoughts? Should we double down on trying to make this work according > >> to the "all integer timestamps" protocol specs, or cut our losses and > >> change the specs? > > > I vote for doubling down. It's bad enough that we have so many > > internal details that depend on this setting; letting that cascade > > into the wire protocol seems like it's just letting the chaos spread > > farther and wider. > > How do you figure that it's not embedded in the wire protocol already? > Not only the replicated data for a timestamp column, but also the > client-visible binary I/O format, depend on this. I think having some > parts of the protocol use a different timestamp format than other parts > is simply weird, and as this exercise has shown, it's bug-prone as all > get out. I don't think it's that closely tied together atm. Things like pg_basebackup, pg_receivexlog etc should work, without having to match timestamp storage. Logical replication, unless your output plugin dumps data in binary / "raw" output, also works just fine across the timestamp divide. It doesn't sound that hard to add a SystemToIntTimestamp() function, given it only needs to do something if float timestamps are enabled? Regards, Andres
On 20/02/17 08:03, Andres Freund wrote: > On 2017-02-19 10:49:29 -0500, Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Thoughts? Should we double down on trying to make this work according >>>> to the "all integer timestamps" protocol specs, or cut our losses and >>>> change the specs? >> >>> I vote for doubling down. It's bad enough that we have so many >>> internal details that depend on this setting; letting that cascade >>> into the wire protocol seems like it's just letting the chaos spread >>> farther and wider. >> >> How do you figure that it's not embedded in the wire protocol already? >> Not only the replicated data for a timestamp column, but also the >> client-visible binary I/O format, depend on this. I think having some >> parts of the protocol use a different timestamp format than other parts >> is simply weird, and as this exercise has shown, it's bug-prone as all >> get out. > > I don't think it's that closely tied together atm. Things like > pg_basebackup, pg_receivexlog etc should work, without having to match > timestamp storage. Logical replication, unless your output plugin dumps > data in binary / "raw" output, also works just fine across the timestamp > divide. > > It doesn't sound that hard to add a SystemToIntTimestamp() function, > given it only needs to do something if float timestamps are enabled? > It's definitely not hard, we already have IntegerTimestampToTimestampTz() which does the opposite conversion anyway. That being said, I did wonder myself if we should just deprecate float timestamps as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote: > That being said, I did wonder myself if we should just deprecate float > timestamps as well. I think we need a proper deprecation period for that, given that the conversion away will be painful for pg_upgrade using people with big clusters. So I think we should fix this regardless... :(
On 20/02/17 12:04, Andres Freund wrote: > On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote: >> That being said, I did wonder myself if we should just deprecate float >> timestamps as well. > > I think we need a proper deprecation period for that, given that the > conversion away will be painful for pg_upgrade using people with big > clusters. So I think we should fix this regardless... :( > That's a good point. Attached should fix the logical replication problems. I am not quite sure if there is anything in physical that needs changing. I opted for GetCurrentIntegerTimestamp() in the reply code as that's the same coding walreceiver uses. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Andres Freund <andres@anarazel.de> writes: > On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote: >> That being said, I did wonder myself if we should just deprecate float >> timestamps as well. > I think we need a proper deprecation period for that, given that the > conversion away will be painful for pg_upgrade using people with big > clusters. So I think we should fix this regardless... :( The question to be asked is whether there is still anybody out there using float timestamps. I'm starting to get dubious about it myself. Certainly, no packager that I'm aware of has shipped a float-timestamp build since we switched the default in 8.4. Maybe there is somebody who's faithfully built a float-timestamp custom build every year so they can pg_upgrade in place from their 8.3 installation, but there have got to be darn few such people. As for "proper deprecation period", the documentation has called the option deprecated since 8.4: -disable-integer-datetimesDisable support for 64-bit integer storage for timestamps andintervals, and store datetime valuesas floating-point numbersinstead. Floating-point datetime storage was the default inPostgreSQL releases prior to 8.4,but it is now deprecated,because it does not support microsecond precision for the fullrange of timestamp values. I think the strongest reason why we didn't move to kill it sooner was that we were not then assuming that every platform had 64-bit ints; but we've required that since 9.0. regards, tom lane
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes: > It's definitely not hard, we already have > IntegerTimestampToTimestampTz() which does the opposite conversion anyway. It's not the functions that are hard, it's making sure that you have used them in the correct places, and declared relevant variables with the appropriate types. Evidence on the ground is that that is very hard; I have little faith that we'll get it right without compiler support. regards, tom lane
On Mon, Feb 20, 2017 at 7:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The question to be asked is whether there is still anybody out there > using float timestamps. I'm starting to get dubious about it myself. > Certainly, no packager that I'm aware of has shipped a float-timestamp > build since we switched the default in 8.4. Maybe there is somebody > who's faithfully built a float-timestamp custom build every year so they > can pg_upgrade in place from their 8.3 installation, but there have got > to be darn few such people. I'm wondering if it has any effect that pg_config.h.win32 says /* Define to 1 if you want 64-bit integer timestamp and interval support. (--enable-integer-datetimes) */ /* #undef USE_INTEGER_DATETIMES */ Whereas pg_config.h.win32 says: /* Define to 1 if you want 64-bit integer timestamp and interval support. (--enable-integer-datetimes) */ #define USE_INTEGER_DATETIMES 1 It looks like it was commit 2169e42bef9db7e0bdd1bea00b81f44973ad83c8 that enabled integer datetimes by default, but that commit seems to not to have touched the Windows build scripts. Commit fcf053d7829f2d83829256153e856f9a36c83ffd changed MSVC over to use integer datetimes by default, but I'm not clear if there's any build environment where we rely on config.h.win32 but not Solution.pm? If not, what exactly is pg_config.h.win32 for and to what degree does it need to be in sync with pg_config.h.in? The list of differences appears to be far more extensive than the header comment at the top of pg_config.h.win32 would lead one to believe. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Feb 20, 2017 at 7:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The question to be asked is whether there is still anybody out there >> using float timestamps. I'm starting to get dubious about it myself. > I'm wondering if it has any effect that pg_config.h.win32 says > /* Define to 1 if you want 64-bit integer timestamp and interval support. > (--enable-integer-datetimes) */ > /* #undef USE_INTEGER_DATETIMES */ > Whereas pg_config.h.win32 says: > /* Define to 1 if you want 64-bit integer timestamp and interval support. > (--enable-integer-datetimes) */ > #define USE_INTEGER_DATETIMES 1 Er, what? For me, grep finds src/include/pg_config.h.in: 836: #undef USE_INTEGER_DATETIMES src/include/pg_config.h.win32: 630: /* #undef USE_INTEGER_DATETIMES */ > It looks like it was commit 2169e42bef9db7e0bdd1bea00b81f44973ad83c8 > that enabled integer datetimes by default, but that commit seems to > not to have touched the Windows build scripts. Commit > fcf053d7829f2d83829256153e856f9a36c83ffd changed MSVC over to use > integer datetimes by default, but I'm not clear if there's any build > environment where we rely on config.h.win32 but not Solution.pm? Any such build would find itself without a defined value of BLCKSZ, to mention just one of the settings that get appended by Solution.pm. It does look like we expect pg_config.h.win32 to be usable standalone for libpq-only Windows compiles, but it would never work for building the backend. (I dunno whether the libpq-only scripts still work at all or have bitrotted, but it's irrelevant for the question at hand.) > If not, what exactly is pg_config.h.win32 for and to what degree does it > need to be in sync with pg_config.h.in? The list of differences > appears to be far more extensive than the header comment at the top of > pg_config.h.win32 would lead one to believe. Yeah, I think the maintenance of pg_config.h.win32 has been a bit more haphazard than one could wish. regards, tom lane
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> The question to be asked is whether there is still anybody out there TL> using float timestamps. Gentoo's ebuild includes: $(use_enable !pg_legacytimestamp integer-datetimes) \ meaning that by default --enable-integer-datetimes is passed to configure, but if the pg_legacytimestamp use flag is set, then --disable-integer-datetimes is passed instead. They document it as: <flag name="pg_legacytimestamp"> Use double precision floating-point numbers instead of 64-bit integers fortimestamp storage. </flag> Ie, w/o any kind of deprecation notice. I don't know how many (how few?) add pg_legacytimestamp to USE when merging postgresql. But it is still available as of 9.6 and also with their live build of git://git.postgresql.org/git/postgresql.git. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 0x997A9F17ED7DAEA6
On 2/21/17 4:52 PM, James Cloos wrote: >>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: > TL> The question to be asked is whether there is still anybody out there > TL> using float timestamps. > > Gentoo's ebuild includes: > > $(use_enable !pg_legacytimestamp integer-datetimes) \ FWIW, last time I looked it was also an option in FreeBSD's ports, though I think it's defaulted to int since forever ago (like, 7.4 era). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 2/20/17 5:04 AM, Andres Freund wrote: > On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote: >> That being said, I did wonder myself if we should just deprecate float >> timestamps as well. > > I think we need a proper deprecation period for that, given that the > conversion away will be painful for pg_upgrade using people with big > clusters. So I think we should fix this regardless... :( I wounder if a separate "floatstamp" data type might fit the bill there. It might not be completely seamless, but it would be binary compatible. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 2017-02-22 00:10:35 -0600, Jim Nasby wrote: > On 2/20/17 5:04 AM, Andres Freund wrote: > > On 2017-02-20 11:58:12 +0100, Petr Jelinek wrote: > > > That being said, I did wonder myself if we should just deprecate float > > > timestamps as well. > > > > I think we need a proper deprecation period for that, given that the > > conversion away will be painful for pg_upgrade using people with big > > clusters. So I think we should fix this regardless... :( > > I wounder if a separate "floatstamp" data type might fit the bill there. It > might not be completely seamless, but it would be binary compatible. I don't really see what'd that solve. - Andres
Andres Freund <andres@anarazel.de> writes: > On 2017-02-22 00:10:35 -0600, Jim Nasby wrote: >> I wounder if a separate "floatstamp" data type might fit the bill there. It >> might not be completely seamless, but it would be binary compatible. > I don't really see what'd that solve. Seems to me this is a different name for what I already tried in <27694.1487456324@sss.pgh.pa.us>. It would be much better than doing nothing, IMO, but it would still leave lots of opportunities for mistakes. To review the bidding a bit, it seems to me we've got four possible ways of dealing with this issue: 1. Just patch the already-identified float-vs-int-timestamp bugs in as localized a fashion as possible, and hope that there aren't any more and that we never introduce any more. I find this hope foolish :-(, especially in view of the fact that what started this discussion is a newly-introduced bug of this ilk. 2. Introduce some new notation, perhaps <27694.1487456324@sss.pgh.pa.us> plus new "sendtimestamp" and "recvtimestamp" functions, to try to provide some compiler-assisted support for getting it right. 3. Give up on "protocol timestamps are always integer style". 4. Give up on float timestamps altogether. While I'm generally not one to vote for dropping backwards-compatibility features, I have to say that I find #4 the most attractive of these options. It would result in getting rid of boatloads of under-tested code, whereas #2 would really just add more, and #3 at best maintains the status quo complexity-wise. I think it was clear from the day we switched to integer timestamps as the default that the days of float timestamps were numbered, and that we were only going to continue to support them as long as it wasn't costing a lot of effort. We have now reached a point at which it's clear that continuing to support them will have direct and significant costs. I say it's time to pull the plug. (To be concrete, I'm suggesting dropping --disable-integer-datetimes in HEAD, and just agreeing that in the back branches, use of replication protocol with float-timestamp servers is not supported and we're not going to bother looking for related bugs there. Given the lack of field complaints, I do not believe anyone cares.) regards, tom lane
On 2017-02-22 08:43:28 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2017-02-22 00:10:35 -0600, Jim Nasby wrote: > >> I wounder if a separate "floatstamp" data type might fit the bill there. It > >> might not be completely seamless, but it would be binary compatible. > > > I don't really see what'd that solve. > > Seems to me this is a different name for what I already tried in > <27694.1487456324@sss.pgh.pa.us>. It would be much better than doing > nothing, IMO, but it would still leave lots of opportunities for mistakes. It sounded more like Jim suggested a full blown SQL type, given that he replied to my concern about the possible need for a deprecation period due to pg_upgrade concerns. To be useful for that, we'd need a good chunk of magic, so all existing uses of timestamp[tz] are replaced with floattimestamp[tz], duplicate some code, add implicit casts, and accept that composites/arrays won't be fixed. That sounds like a fair amount of work to me, and we'd still have no way to remove the code without causing pain. > 1. Just patch the already-identified float-vs-int-timestamp bugs in as > localized a fashion as possible, and hope that there aren't any more and > that we never introduce any more. I find this hope foolish :-(, > especially in view of the fact that what started this discussion is a > newly-introduced bug of this ilk. Well, the newly introduced bug was just a copy of the existing code. I'm far less negative about this than you - we're not dealing with a whole lot of code here. I don't get why it'd be foolish to verify the limited amount of code dealing with replication protocol timestamp. > 4. Give up on float timestamps altogether. > > While I'm generally not one to vote for dropping backwards-compatibility > features, I have to say that I find #4 the most attractive of these > options. It would result in getting rid of boatloads of under-tested > code, whereas #2 would really just add more, and #3 at best maintains > the status quo complexity-wise. > (To be concrete, I'm suggesting dropping --disable-integer-datetimes > in HEAD, and just agreeing that in the back branches, use of replication > protocol with float-timestamp servers is not supported and we're not > going to bother looking for related bugs there. Given the lack of field > complaints, I do not believe anyone cares.) I think we should just fix it in the back branches, and drop the float timestamp code in 10 or 11. I.e. 1) and then 4). Greetings, Andres Freund
Tom, all, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > While I'm generally not one to vote for dropping backwards-compatibility > features, I have to say that I find #4 the most attractive of these > options. It would result in getting rid of boatloads of under-tested > code, whereas #2 would really just add more, and #3 at best maintains > the status quo complexity-wise. +1. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> While I'm generally not one to vote for dropping backwards-compatibility >> features, I have to say that I find #4 the most attractive of these >> options. It would result in getting rid of boatloads of under-tested >> code, whereas #2 would really just add more, and #3 at best maintains >> the status quo complexity-wise. > +1. BTW, I did a bit of digging in the archives, and found a couple of previous discussions around this: https://www.postgresql.org/message-id/flat/1178476313.18303.164.camel%40goldbach https://www.postgresql.org/message-id/flat/1287334597.8516.372.camel%40jdavis as well as numerous discussions of specific bugs associated with float timestamps, eg this isn't even the first time we've hit a float-timestamp replication bug: https://www.postgresql.org/message-id/flat/CAHGQGwFbqTDBrw95jek6_RvYG2%3DE-6o0HOpbeEcP6oWHJTLkUw%40mail.gmail.com In one or another of those threads, I opined that we'd have to keep float timestamps available for as long as we were supporting pg_upgrade from 8.3. However, I think that that argument hasn't really stood the test of time, because of the lack of packagers shipping builds with float timestamps turned on. There may be some number of installations that still have float timestamps active, but it's got to be a tiny number. And the list of reasons not to like float timestamps is very long. regards, tom lane
On 2/22/17 7:56 AM, Andres Freund wrote: > On 2017-02-22 08:43:28 -0500, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> On 2017-02-22 00:10:35 -0600, Jim Nasby wrote: >>>> I wounder if a separate "floatstamp" data type might fit the bill there. It >>>> might not be completely seamless, but it would be binary compatible. >>> I don't really see what'd that solve. >> Seems to me this is a different name for what I already tried in >> <27694.1487456324@sss.pgh.pa.us>. It would be much better than doing >> nothing, IMO, but it would still leave lots of opportunities for mistakes. > It sounded more like Jim suggested a full blown SQL type, given that he > replied to my concern about the possible need for a deprecation period > due to pg_upgrade concerns. To be useful for that, we'd need a good > chunk of magic, so all existing uses of timestamp[tz] are replaced with > floattimestamp[tz], duplicate some code, add implicit casts, and accept > that composites/arrays won't be fixed. That sounds like a fair amount > of work to me, and we'd still have no way to remove the code without > causing pain. Right, but I was thinking more in line with just providing the type (as an extension, perhaps not even in core) and making it possible for pg_upgrade to switch fields over to that type. That would allow an in-place upgrade of a really large cluster. A user would still need to modify their code to use the new type. Put another way: add ability for pg_upgrade to change the type of a field. There might be other uses for that as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 2017-02-22 09:06:38 -0600, Jim Nasby wrote: > On 2/22/17 7:56 AM, Andres Freund wrote: > > It sounded more like Jim suggested a full blown SQL type, given that he > > replied to my concern about the possible need for a deprecation period > > due to pg_upgrade concerns. To be useful for that, we'd need a good > > chunk of magic, so all existing uses of timestamp[tz] are replaced with > > floattimestamp[tz], duplicate some code, add implicit casts, and accept > > that composites/arrays won't be fixed. That sounds like a fair amount > > of work to me, and we'd still have no way to remove the code without > > causing pain. > > Right, but I was thinking more in line with just providing the type (as an > extension, perhaps not even in core) and making it possible for pg_upgrade > to switch fields over to that type. Isn't the above what you'd need to do that? > That would allow an in-place upgrade of > a really large cluster. A user would still need to modify their code to use > the new type. > > Put another way: add ability for pg_upgrade to change the type of a field. > There might be other uses for that as well. Type oids are unfortunately embedded into composite and array type data - we can do such changes for columns themselves, but it doesn't work if there's any array/composite members containing the to-be-changed type that are used as columns. - Andres
On 2/22/17 9:12 AM, Andres Freund wrote: >> That would allow an in-place upgrade of >> a really large cluster. A user would still need to modify their code to use >> the new type. >> >> Put another way: add ability for pg_upgrade to change the type of a field. >> There might be other uses for that as well. > Type oids are unfortunately embedded into composite and array type data > - we can do such changes for columns themselves, but it doesn't work if > there's any array/composite members containing the to-be-changed type > that are used as columns. Only in the catalog though, not the datums, right? I would think you could just change the oid in the catalog the same as you would for a table column. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On 02/22/2017 10:21 AM, Jim Nasby wrote: > On 2/22/17 9:12 AM, Andres Freund wrote: >>> That would allow an in-place upgrade of >>> a really large cluster. A user would still need to modify their code >>> to use >>> the new type. >>> >>> Put another way: add ability for pg_upgrade to change the type of a >>> field. >>> There might be other uses for that as well. >> Type oids are unfortunately embedded into composite and array type data >> - we can do such changes for columns themselves, but it doesn't work if >> there's any array/composite members containing the to-be-changed type >> that are used as columns. > > Only in the catalog though, not the datums, right? I would think you > could just change the oid in the catalog the same as you would for a > table column. No, in the datums. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > On 02/22/2017 10:21 AM, Jim Nasby wrote: >> Only in the catalog though, not the datums, right? I would think you >> could just change the oid in the catalog the same as you would for a >> table column. > No, in the datums. Yeah, I don't see any way that we could fob off float timestamps to an extension that would be completely transparent at the pg_upgrade level. And even a partial solution would be an enormous amount of fundamentally dead-end work. There has never been any project policy that promises everybody will be able to pg_upgrade until the end of time. I think it's not unreasonable for us to say that anyone still using float timestamps has to go through a dump and reload to get to v10. The original discussion about getting rid of them was ten years ago come May; that seems long enough. regards, tom lane
On Mon, Feb 20, 2017 at 11:58:12AM +0100, Petr Jelinek wrote: > On 20/02/17 08:03, Andres Freund wrote: > > On 2017-02-19 10:49:29 -0500, Tom Lane wrote: > >> Robert Haas <robertmhaas@gmail.com> writes: > >>> On Sun, Feb 19, 2017 at 3:31 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>>> Thoughts? Should we double down on trying to make this work according > >>>> to the "all integer timestamps" protocol specs, or cut our losses and > >>>> change the specs? > >> > >>> I vote for doubling down. It's bad enough that we have so many > >>> internal details that depend on this setting; letting that cascade > >>> into the wire protocol seems like it's just letting the chaos spread > >>> farther and wider. > >> > >> How do you figure that it's not embedded in the wire protocol already? > >> Not only the replicated data for a timestamp column, but also the > >> client-visible binary I/O format, depend on this. I think having some > >> parts of the protocol use a different timestamp format than other parts > >> is simply weird, and as this exercise has shown, it's bug-prone as all > >> get out. > > > > I don't think it's that closely tied together atm. Things like > > pg_basebackup, pg_receivexlog etc should work, without having to match > > timestamp storage. Logical replication, unless your output plugin dumps > > data in binary / "raw" output, also works just fine across the timestamp > > divide. > > > > It doesn't sound that hard to add a SystemToIntTimestamp() function, > > given it only needs to do something if float timestamps are enabled? > > > > It's definitely not hard, we already have > IntegerTimestampToTimestampTz() which does the opposite conversion anyway. > > That being said, I did wonder myself if we should just deprecate float > timestamps as well. +1 for deprecating them. If we need a timestamp(tz) with a wider range, we are getting options we didn't have before for implementing it. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Andres Freund <andres@anarazel.de> writes: > On 2017-02-22 08:43:28 -0500, Tom Lane wrote: >> (To be concrete, I'm suggesting dropping --disable-integer-datetimes >> in HEAD, and just agreeing that in the back branches, use of replication >> protocol with float-timestamp servers is not supported and we're not >> going to bother looking for related bugs there. Given the lack of field >> complaints, I do not believe anyone cares.) > I think we should just fix it in the back branches, and drop the float > timestamp code in 10 or 11. I.e. 1) and then 4). I'm not personally interested in touching this issue in the back branches, but if you want to spend time on it, I surely won't stand in your way. What I *am* willing to spend time on is removing float-timestamp code in HEAD. I've not yet heard anybody speak against doing that (or at least, nothing I interpreted as a vote against it). If I've not heard any complaints by tomorrow, I'll get started on that. I envision the following work plan: * Reject --disable-integer-datetimes in configure. To avoid breaking existing build scripts unnecessarily, --enable-integer-datetimes will still be accepted. * Convert HAVE_INT64_TIMESTAMP to a fixed, always-defined symbol. (We shouldn't remove it entirely because that would break third-party code that might be testing it. Perhaps shove it in pg_config_manual.h.) * Possibly remove the enableIntTimes field from pg_control as well as associated logic in places like pg_controldata. There might be some argument for keeping this, though ... anyone have an opinion? pg_upgrade, at least, would need a bespoke test instead. * Remove appropriate user documentation. * Remove all core-code tests for HAVE_INT64_TIMESTAMP, along with the negative sides of those #if tests. * Change the places in the replication code that declare timestamp variables to declare them as TimestampTz not int64, and adjust nearby comments accordingly. (This will just be code beautification at this point, but it still seems like a good idea.) regards, tom lane
On Mon, Feb 20, 2017 at 09:07:33AM -0500, Tom Lane wrote: > The question to be asked is whether there is still anybody out there > using float timestamps. I'm starting to get dubious about it myself. > Certainly, no packager that I'm aware of has shipped a float-timestamp > build since we switched the default in 8.4. Maybe there is somebody > who's faithfully built a float-timestamp custom build every year so they > can pg_upgrade in place from their 8.3 installation, but there have got > to be darn few such people. > > As for "proper deprecation period", the documentation has called the > option deprecated since 8.4: > > -disable-integer-datetimes > Disable support for 64-bit integer storage for timestamps and > intervals, and store datetime values as floating-point numbers > instead. Floating-point datetime storage was the default in > PostgreSQL releases prior to 8.4, but it is now deprecated, > because it does not support microsecond precision for the full > range of timestamp values. > > I think the strongest reason why we didn't move to kill it sooner was > that we were not then assuming that every platform had 64-bit ints; > but we've required that since 9.0. I agree with removing support for float timestamps in PG 10. It would be tempting to think we can reuse the bits we stopped using in 9.0 for VACUUM FULL, e.g.: #define HEAP_MOVED_OFF 0x4000 /* moved to another place by pre-9.0 * VACUUMFULL; kept for binary * upgrade support */#define HEAP_MOVED_IN 0x8000 /* moved from another place by pre-9.0 * VACUUM FULL; kept for binary * upgrade support */ However, if users built Postgres 8.4 with integer timestamps, they could still be in the data files. Does anyone see a fault in my logic, I hope? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On 02/22/2017 02:45 PM, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: >> On 2017-02-22 08:43:28 -0500, Tom Lane wrote: >>> (To be concrete, I'm suggesting dropping --disable-integer-datetimes >>> in HEAD, and just agreeing that in the back branches, use of replication >>> protocol with float-timestamp servers is not supported and we're not >>> going to bother looking for related bugs there. Given the lack of field >>> complaints, I do not believe anyone cares.) > > What I *am* willing to spend time on is removing float-timestamp code > in HEAD. I've not yet heard anybody speak against doing that (or at > least, nothing I interpreted as a vote against it). If I've not heard > any complaints by tomorrow, I'll get started on that. Rip it out! Sincerely, JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own.
On 2017-02-27 17:00:23 -0800, Joshua D. Drake wrote: > On 02/22/2017 02:45 PM, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > > > On 2017-02-22 08:43:28 -0500, Tom Lane wrote: > > > > (To be concrete, I'm suggesting dropping --disable-integer-datetimes > > > > in HEAD, and just agreeing that in the back branches, use of replication > > > > protocol with float-timestamp servers is not supported and we're not > > > > going to bother looking for related bugs there. Given the lack of field > > > > complaints, I do not believe anyone cares.) > > > > What I *am* willing to spend time on is removing float-timestamp code > > in HEAD. I've not yet heard anybody speak against doing that (or at > > least, nothing I interpreted as a vote against it). If I've not heard > > any complaints by tomorrow, I'll get started on that. > > Rip it out! Already happened: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3
Hi, I know I'm 7 months late to this, but only just read the beta 4 release notes. Is there anything people using float datetimes can do that isn't a pg_dumpall | pg_restore to do a less painful update? We have several TB of data still using float datetimes and I'm trying to figure out how we can move to 10 (currently on 9.6.x) without massive amounts of $ in duplicated hardware or downtime. I did attempt a pg_dumpall | pg_restore at one point but for whatever reason we had data in tables that integer datetimes fails on (I forget the exact crash, but the datetime values were either too small or too large to fit into the integer datetimes field -- I can retry this if it would be helpful). Thanks. Regards, Omar On Mon, Feb 27, 2017 at 5:13 PM, Andres Freund <andres@anarazel.de> wrote: > On 2017-02-27 17:00:23 -0800, Joshua D. Drake wrote: >> On 02/22/2017 02:45 PM, Tom Lane wrote: >> > Andres Freund <andres@anarazel.de> writes: >> > > On 2017-02-22 08:43:28 -0500, Tom Lane wrote: >> > > > (To be concrete, I'm suggesting dropping --disable-integer-datetimes >> > > > in HEAD, and just agreeing that in the back branches, use of replication >> > > > protocol with float-timestamp servers is not supported and we're not >> > > > going to bother looking for related bugs there. Given the lack of field >> > > > complaints, I do not believe anyone cares.) >> > >> > What I *am* willing to spend time on is removing float-timestamp code >> > in HEAD. I've not yet heard anybody speak against doing that (or at >> > least, nothing I interpreted as a vote against it). If I've not heard >> > any complaints by tomorrow, I'll get started on that. >> >> Rip it out! > > Already happened: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6aa17e0ae367afdcea07118e016111af4fa6bc3 > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
Omar Kilani <omar.kilani@gmail.com> writes: > Is there anything people using float datetimes can do that isn't a > pg_dumpall | pg_restore to do a less painful update? Um, not really. You may be stuck on 9.6 until you can spare the effort to convert. The physical representations of timestamps are totally different in the two cases. > I did attempt a pg_dumpall | pg_restore at one point but for whatever > reason we had data in tables that integer datetimes fails on (I forget > the exact crash, but the datetime values were either too small or too > large to fit into the integer datetimes field -- I can retry this if > it would be helpful). I'm pretty sure the minimum values are the same in both cases, to wit Julian day zero. As for the max, according to the old code comments * The upper limit for dates is 5874897-12-31, which is a bit less than what* the Julian-date code can allow. We use thatsame limit for timestamps when* using floating-point timestamps ... For integer timestamps, the upper* limit is 294276-12-31. I would hope that any timestamps you've got beyond 294276AD are erroneous data that you need to clean up anyway. regards, tom lane
On 09/06/17 18:33, Omar Kilani wrote: > Is there anything people using float datetimes can do that isn't a > pg_dumpall | pg_restore to do a less painful update? > > We have several TB of data still using float datetimes and I'm trying > to figure out how we can move to 10 (currently on 9.6.x) without > massive amounts of $ in duplicated hardware or downtime. I ran into an analogous issue with endianness of PL/Java-defined datatypes, and ended up devising a procedure [1] for treating the type one way on output and another way on input, and then constructing an UPDATE query that just assigns the column to itself using a function that's essentially identity, but forces the output-input conversion (and doesn't look like a simple identity function to the query optimizer, which otherwise might optimize the whole update away). While the details of that were specific to PL/Java and byte order, something similar might work in your case if you can afford some period, either just before or just after migration, when your normal workload is blocked, long enough to run such updates on your several TB of data. Or if that's too big a chunk of downtime, you might be able to add a column in advance, of some user-defined type the same width as an integer datetime, populate that in advance, migrate, then do a quick catalog switcheroo of the column names and types. I've never done that, so someone else might have comments on its feasibility or the best way of doing it. > the exact crash, but the datetime values were either too small or too > large to fit into the integer datetimes field -- I can retry this if That does seem important to get the details on. If the integer datetime column won't represent your stuff (and the too-large or too-small values are necessary to represent), then some schema change might be necessary. Or, if the outlying values were sentinel values of some kind (I wonder, how else would you even have datetimes outside of the int64 range?), maybe they can be replaced with others that fit. -Chap [1]: https://tada.github.io/pljava/use/byteordermigrate.html