Thread: 9.4 broken on alpha
Hi, From the Debian ports buildd: https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-1&stamp=1434132509 make[5]: Entering directory '/«PKGBUILDDIR»/build/src/backend/postmaster' [...] gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat -Werror=format-security -I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include -I/«PKGBUILDDIR»/build/../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include/tcl8.6 -c -o bgworker.o /«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c /tmp/cc4j88on.s: Assembler messages: /tmp/cc4j88on.s:952: Error: unknown opcode `rmb' as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg as: Please report this bug. <builtin>: recipe for target 'bgworker.o' failed make[5]: *** [bgworker.o] Error 1 There's a proposed patch: https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;msg=5;bug=756368;filename=alpha-fix-read-memory-barrier.patch Christoph Berg -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On 08/25/2015 06:16 AM, Christoph Berg wrote: > Hi, > > From the Debian ports buildd: > > https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-1&stamp=1434132509 > > make[5]: Entering directory '/«PKGBUILDDIR»/build/src/backend/postmaster' > [...] > gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat -Werror=format-security -I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include -I/«PKGBUILDDIR»/build/../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include/tcl8.6 -c -o bgworker.o /«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c > /tmp/cc4j88on.s: Assembler messages: > /tmp/cc4j88on.s:952: Error: unknown opcode `rmb' > as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg > > as: Please report this bug. > > <builtin>: recipe for target 'bgworker.o' failed > make[5]: *** [bgworker.o] Error 1 needs a buildfarm animal. If we had one we'd presumably have caught this much earlier. cheers andrew
On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote: > needs a buildfarm animal. If we had one we'd presumably have caught this > much earlier. On the other hand, we dropped alpha support in 9.5, ...
On 08/25/2015 08:30 AM, Andres Freund wrote: > On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote: >> needs a buildfarm animal. If we had one we'd presumably have caught this >> much earlier. > On the other hand, we dropped alpha support in 9.5, ... Oh, I missed that. Sorry for the noise. cheers andrew
On 2015-08-25 08:57, Andrew Dunstan wrote: > On 08/25/2015 08:30 AM, Andres Freund wrote: > > On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote: > >> needs a buildfarm animal. If we had one we'd presumably have caught this > >> much earlier. > > On the other hand, we dropped alpha support in 9.5, ... > Oh, I missed that. Sorry for the noise. I've been meaning to report this myself. In the 4 years that that particular line has been there, not once had anyone else run into it on Gentoo until a couple months ago. And it isn't a case of end users missing it as we have arch testers that test packages before marking them suitable for public consumption. Alpha is one of the arches. As for the dropped support, has the Alpha specific code been ripped out? Would it still presumably run on Alpha?
Aaron W. Swenson wrote: > I've been meaning to report this myself. > > In the 4 years that that particular line has been there, not once had > anyone else run into it on Gentoo until a couple months ago. > > And it isn't a case of end users missing it as we have arch testers > that test packages before marking them suitable for public consumption. > > Alpha is one of the arches. This means that not once has anybody compiled in an Alpha in 4 years. > As for the dropped support, has the Alpha specific code been ripped > out? Would it still presumably run on Alpha? Yes, code has been ripped out. I would assume that it doesn't build at all anymore, but maybe what happens is you get spinlocks emulated with semaphores and it's only horribly slow. See http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Aaron W. Swenson wrote: >> In the 4 years that that particular line has been there, not once had >> anyone else run into it on Gentoo until a couple months ago. >> And it isn't a case of end users missing it as we have arch testers >> that test packages before marking them suitable for public consumption. >> Alpha is one of the arches. > This means that not once has anybody compiled in an Alpha in 4 years. Well, strictly speaking, there were no uses of pg_read_barrier until 9.4. However, pg_write_barrier (which used "wmb") was in use since 9.2; so unless you're claiming your assembler knows wmb but not rmb, the code's failed to compile for Alpha since 9.2. >> As for the dropped support, has the Alpha specific code been ripped >> out? Would it still presumably run on Alpha? > Yes, code has been ripped out. I would assume that it doesn't build at > all anymore, but maybe what happens is you get spinlocks emulated with > semaphores and it's only horribly slow. The whole business about laxer-than-average memory coherency gives me the willies, though. It's fairly likely that PG has never worked right on multi-CPU Alphas. regards, tom lane
On 2015-08-25 15:43:12 -0400, Aaron W. Swenson wrote: > As for the dropped support, has the Alpha specific code been ripped > out? Would it still presumably run on Alpha? I'm pretty sure that postgres hasn't run correctly under concurrency on alpha for a long while. The lax cache coherency makes developing concurrent code hard. Since there are rarely, if ever, people testing postgres on alpha under load it's nigh on impossible to verify anything. Having to adhere to a more complicated memory model than for any other architecture isn't worth it, since there barely are users. Greetings, Andres Freund
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Aaron W. Swenson wrote: > >> In the 4 years that that particular line has been there, not once had > >> anyone else run into it on Gentoo until a couple months ago. > >> And it isn't a case of end users missing it as we have arch testers > >> that test packages before marking them suitable for public consumption. > >> Alpha is one of the arches. > > > This means that not once has anybody compiled in an Alpha in 4 years. > > Well, strictly speaking, there were no uses of pg_read_barrier until 9.4. > However, pg_write_barrier (which used "wmb") was in use since 9.2; so > unless you're claiming your assembler knows wmb but not rmb, the code's > failed to compile for Alpha since 9.2. Actually according to this http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf there is a wmb instruction but there is no rmb. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Well, strictly speaking, there were no uses of pg_read_barrier until 9.4. >> However, pg_write_barrier (which used "wmb") was in use since 9.2; so >> unless you're claiming your assembler knows wmb but not rmb, the code's >> failed to compile for Alpha since 9.2. > Actually according to this > http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf > there is a wmb instruction but there is no rmb. Oh really? If rmb were a figment of someone's imagination, it would explain the build failure (although not why nobody's reported it till now). It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth the trouble, since we're desupporting Alpha as of 9.5 anyway. If the effective desupport date is 9.4 instead, how much difference does that make? regards, tom lane
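For readers following along, the s/rmb/mb/ substitution mentioned above can be sketched roughly as follows. This is an illustration only, not the patch attached to the Debian bug: the DEMO_ macro names, the non-Alpha fallback to GCC's __sync_synchronize() builtin, and the helper function are all assumptions added so the sketch compiles on machines other than Alpha.

```c
#include <assert.h>

/*
 * Illustrative barrier macros (the DEMO_ names are hypothetical).
 * Per the thread, Alpha provides "mb" (full barrier) and "wmb" (write
 * barrier) but no "rmb", so the read barrier uses the full barrier.
 */
#if defined(__alpha) || defined(__alpha__)
#define DEMO_MEMORY_BARRIER() __asm__ __volatile__ ("mb" : : : "memory")
#define DEMO_READ_BARRIER()   __asm__ __volatile__ ("mb" : : : "memory")  /* not "rmb" */
#define DEMO_WRITE_BARRIER()  __asm__ __volatile__ ("wmb" : : : "memory")
#else
/* Assumption: elsewhere, fall back to the GCC full-barrier builtin. */
#define DEMO_MEMORY_BARRIER() __sync_synchronize()
#define DEMO_READ_BARRIER()   __sync_synchronize()
#define DEMO_WRITE_BARRIER()  __sync_synchronize()
#endif

/*
 * Single-threaded smoke test: it only shows that each macro expands to
 * something the toolchain accepts.  It says nothing about ordering
 * between CPUs.
 */
int demo_barriers_smoke(void)
{
    volatile int x = 1;
    int y;

    DEMO_WRITE_BARRIER();
    x = 2;
    DEMO_READ_BARRIER();
    y = x;
    DEMO_MEMORY_BARRIER();
    assert(y == 2);
    return y;
}
```

Using the full barrier where a read barrier was intended is stronger than strictly needed, which is why it is a safe, if slightly slower, replacement.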
On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> Well, strictly speaking, there were no uses of pg_read_barrier until 9.4. > >> However, pg_write_barrier (which used "wmb") was in use since 9.2; so > >> unless you're claiming your assembler knows wmb but not rmb, the code's > >> failed to compile for Alpha since 9.2. > > > Actually according to this > > http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf > > there is a wmb instruction but there is no rmb. Exactly, as I had explained in the Debian bug report [1] over a year ago. > Oh really? If rmb were a figment of someone's imagination, it would > explain the build failure (although not why nobody's reported it till > now). I reported the failure to build on Alpha, with an explanation and a patch to fix it, to the Debian package maintainers over a year ago, and within about a month of version 9.4 being uploaded to Debian. My recollection is that prior versions (9.2 and 9.3) compiled on Alpha so the use of the wrong barrier, and the fix, was in fact reported in a timely fashion following the first reasonable chance to observe the problem. It has been built and running at Debian-Ports for over a year now as I uploaded the fixed version to the Alpha unreleased distribution. > It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth > the trouble, since we're desupporting Alpha as of 9.5 anyway. That is disappointing to hear. Why is that? It is still in use on Alpha. What is the maintenance load for keeping the Alpha arch specific code? > If the effective desupport date is 9.4 instead, It's not. The fixed and built 9.4 version was uploaded to Debian-Ports Alpha (in the unreleased distribution) and has been in use for over a year. Regards, Michael. [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=756368
Michael Cree wrote: > On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote: > > Oh really? If rmb were a figment of someone's imagination, it would > > explain the build failure (although not why nobody's reported it till > > now). > > I reported the failure to build on Alpha, with an explanation and a > patch to fix it, to the Debian package maintainers over a year ago, > and within about a month of version 9.4 being uploaded to Debian. It's a pretty disappointing packaging process failure that the bug report wasn't sent to upstream immediately, rather than waiting for a whole year. That would have made it a lot less likely that the removal of the port would have passed muster in the first place. Supposedly we were only removing stuff that was pretty clearly dead. > It has been built and running at Debian-Ports for over a year now as > I uploaded the fixed version to the Alpha unreleased distribution. Has this been battle-tested under high load in multi-core servers? Because based on other comments in this and other threads, it seems likely that the port is subtly broken. > > It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth > > the trouble, since we're desupporting Alpha as of 9.5 anyway. > > That is disappointing to hear. Why is that? It is still in use on > Alpha. What is the maintenance load for keeping the Alpha arch > specific code? The amount of code that was removed by the commit isn't all that much: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d but there's been rather a lot of work after that to add support for atomic primitives as well as barriers, which would presumably not be trivial to implement and test on Alpha. Someone would have to volunteer to write, test and maintain that code. A buildfarm machine would be mandatory, too. I'm under the impression that Alpha machines are no longer being built, so I'm doubtful that this would be a good use of anybody's time. 
> > If the effective desupport date is 9.4 instead, > > It's not. The fixed and built 9.4 version was uploaded to Debian-Ports > Alpha (in the unreleased distribution) and has been in use for over a > year. I think we could apply the bugfix to 9.4, but this doesn't help with 9.5. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Michael Cree <mcree@orcon.net.nz> writes: > On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote: >> It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth >> the trouble, since we're desupporting Alpha as of 9.5 anyway. > That is disappointing to hear. Why is that? It is still in use on > Alpha. What is the maintenance load for keeping the Alpha arch > specific code? The core problem is that Alpha's unusually lax memory coherency model creates design and testing problems we face with no other architecture. We're not really excited about carrying that burden for a legacy architecture that isn't competitive in the modern world. Even if we were, it's completely impractical to maintain such an unusual port when there is no representative of the architecture in our buildfarm (http://buildfarm.postgresql.org/cgi-bin/show_status.pl). It's worth pointing out that had there been one, we would have noticed the rmb problem long before 9.4 ever shipped. I do not know anything about the prevalence of multi-CPU Alpha machines. If they're thin on the ground compared to single-CPU, maybe we could just document that we only support the latter, and not worry too much about the memory coherency issues. But in any case, without a commitment from somebody to maintain an Alpha buildfarm member, we will absolutely not consider reviving that port. regards, tom lane
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Michael Cree wrote: >> That is disappointing to hear. Why is that? It is still in use on >> Alpha. What is the maintenance load for keeping the Alpha arch >> specific code? > The amount of code that was removed by the commit isn't all that much: > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d > but there's been rather a lot of work after that to add support for > atomic primitives as well as barriers, which would presumably not be > trivial to implement and test on Alpha. Someone would have to volunteer > to write, test and maintain that code. As far as that goes, we do have fallback atomics code that's supposed to work on anything (and not be unusably slow). So in principle, resurrecting the Alpha spinlock code ought to be enough to get back to the previous level of support. Coding Alpha atomic primitives would likely be worth doing, if there's somebody out there who's excited enough to take it on; but that could happen later, and incrementally. > A buildfarm machine would be mandatory, too. That, however, is not negotiable. regards, tom lane
On 2015-08-26 12:49:46 -0400, Tom Lane wrote: > As far as that goes, we do have fallback atomics code that's supposed to > work on anything (and not be unusably slow). So in principle, > resurrecting the Alpha spinlock code ought to be enough to get back to the > previous level of support. Coding Alpha atomic primitives would likely > be worth doing, if there's somebody out there who's excited enough to take > it on; but that could happen later, and incrementally. Actually, on linux and most other OSs it should just use the generic gcc based implementation and be pretty close to optimal. The only thing it'd need would be to define the memory barriers, so the fallback implementation of those isn't used. But I really strongly object to re-introducing alpha support. Having to care about data dependency barriers is a huge pita, and it complicates code for everyone. And we'd have to investigate a lot of code to actually make it work reliably. For what benefit? > > A buildfarm machine would be mandatory, too. > > That, however, is not negotiable. If we really were to re-introduce this we'd need an actual developer machine to run tests against.
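The "data dependency barrier" burden mentioned above can be illustrated with a publish/consume pair. On most architectures, a load through a pointer is ordered after the load of the pointer itself; Alpha is the architecture usually cited as not guaranteeing that, so the reader side needs an explicit barrier there. The sketch below is an assumption-laden illustration using C11 <stdatomic.h> rather than PostgreSQL's own barrier macros, and every demo_ name in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical message type and shared slot, for illustration only. */
struct demo_msg { int payload; };

static struct demo_msg demo_slot;
static _Atomic(struct demo_msg *) demo_published;   /* starts out NULL */

/* Writer: fill in the payload, then publish the pointer. */
void demo_publish(int value)
{
    demo_slot.payload = value;
    /* Release ordering: the payload store cannot be reordered after
     * the pointer store. */
    atomic_store_explicit(&demo_published, &demo_slot, memory_order_release);
}

/* Reader: fetch the pointer, then read through it. */
int demo_consume(void)
{
    /* Acquire ordering supplies the read-side barrier that an
     * Alpha-like memory model requires even though the payload load
     * depends on the pointer load.  On other targets the compiler can
     * usually implement this very cheaply. */
    struct demo_msg *m =
        atomic_load_explicit(&demo_published, memory_order_acquire);

    if (m == NULL)
        return -1;              /* nothing published yet */
    return m->payload;
}
```

The point of the sketch is the reader side: code that omits the acquire (or an equivalent read barrier) tends to work everywhere except on a weakly ordered machine like Alpha, which is what makes such bugs hard to find without the hardware.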
Andres Freund <andres@anarazel.de> writes: > But I really strongly object to re-introducing alpha support. Having to > care about data dependency barriers is a huge pita, and it complicates > code for everyone. And we'd have to investigate a lot of code to > actually make it work reliably. For what benefit? I hear you, but that's only an issue for multi-CPU machines no? If we just say "we doubt this works on multi-CPU Alphas, if it breaks you get to keep both pieces", then we're basically at the same place we were before. To be clear: I don't want to do the work you're speaking of, either. But if we have people who were successfully using PG on Alphas before, the coherency issues must not have been a problem for them. Can't we just (continue to) ignore the issue? > If we really were to re-introduce this we'd need an actual developer > machine to run tests against. I would certainly expect that we'd insist on active support from the Alpha community; we're not going to continue to do this in an open-loop fashion. regards, tom lane
On Wed, Aug 26, 2015 at 12:49:46PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > A buildfarm machine would be mandatory, too. > > That, however, is not negotiable. Right. I think the still-open question around PostgreSQL on Alpha is whether 9.1 through 9.4 are meaningfully supported there. Step one for anyone interested in Alpha support is to activate a buildfarm member covering that range of versions. Without that, the PostgreSQL community is just listening for bug fix contributions. On Wed, Aug 26, 2015 at 01:34:38PM -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > But I really strongly object to re-introducing alpha support. Having to > > care about data dependency barriers is a huge pita, and it complicates > > code for everyone. And we'd have to investigate a lot of code to > > actually make it work reliably. For what benefit? > > I hear you, but that's only an issue for multi-CPU machines no? If we > just say "we doubt this works on multi-CPU Alphas, if it breaks you get to > keep both pieces", then we're basically at the same place we were before. > > To be clear: I don't want to do the work you're speaking of, either. > But if we have people who were successfully using PG on Alphas before, > the coherency issues must not have been a problem for them. Can't we > just (continue to) ignore the issue? The landscape changed with the 9.5 cycle's push to use more lock-free algorithms. Dropping Alpha support simplified review for those algorithms. True, we could ignore Alpha for review purposes and accept unstudied damage to reliability on Alpha. To some extent, that characterizes any platform whose test reports don't reach us. It's different when we know Alpha has special needs and we make changes in the area of those needs without even attempting to meet them. We made a decision to instead break compatibility explicitly, and I don't think this thread has impugned that decision. 
As it is, we've implicitly prepared to ship Alpha-supporting PostgreSQL 9.4 until 2019, by which time the newest Alpha hardware will be 15 years old. Computer museums would be our only audience for continued support. I do have a sentimental weakness for computer museums, but not at the price of drag on important performance work. nm
Re: Andrew Dunstan 2015-08-25 <55DC5F9E.60601@dunslane.net> > >gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat -Werror=format-security -I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include -I/«PKGBUILDDIR»/build/../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include/tcl8.6 -c -o bgworker.o /«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c > >/tmp/cc4j88on.s: Assembler messages: > >/tmp/cc4j88on.s:952: Error: unknown opcode `rmb' > >as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg > > > needs a buildfarm animal. If we had one we'd presumably have caught this > much earlier. In the meantime, I've added this patch to the 9.4 Debian package, and the build including check-world succeeds: https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-2&stamp=1440797195 It'd be nice if the patch could get applied to 9.4 and earlier. Thanks, Christoph -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On Wed, Aug 26, 2015 at 09:19:09PM -0400, Noah Misch wrote: > As it is, we've implicitly prepared to ship Alpha-supporting > PostgreSQL 9.4 until 2019, by which time the newest Alpha hardware > will be 15 years old. Computer museums would be our only audience > for continued support. I do have a sentimental weakness for > computer museums, but not at the price of drag on important > performance work. +1000 I think we need to take realistic stock of what we're doing and what we should require. At a minimum, we should de-support every platform on which literally no new deployments will ever happen. I'm looking specifically at you, HPUX, and I could make a pretty good case for the idea that we can relegate 32-bit platforms to the ash heap of history, at least on the server side. Then, there's the question of rotating media. Given the givens, we ought to be drawing up plans for the cases where we might consider supporting them, but those would need to be zero-based plans, i.e. the starting point would be that we don't support them, and arguments would have to be made affirmatively to support them for some specific, demonstrable use case. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2015-08-29 08:32:29 -0700, David Fetter wrote: > and I could make a pretty good > case for the idea that we can relegate 32-bit platforms to the ash > heap of history, at least on the server side. Don't see the point, it doesn't cost us very much. > Then, there's the question of rotating media. Given the givens, we > ought to be drawing up plans for the cases where we might consider > supporting them, but those would need to be zero-based plans, i.e. the > starting point would be that we don't support them, and arguments > would have to be made affirmatively to support them for some specific, > demonstrable use case. We don't particularly care either way, so I don't see why we'd add/drop support for anything here.
Christoph Berg <christoph.berg@credativ.de> writes: > It'd be nice if the patch could get applied to 9.4 and earlier. I've pushed that patch into 9.4. Barring somebody stepping forward with an offer of a buildfarm member and any other necessary developer support, I do not think there will be any further consideration of reversing our decision to drop Alpha support as of 9.5. regards, tom lane
Re: Michael Cree 2015-08-26 <20150826052530.GA4256@tower> > I reported the failure to build on Alpha, with an explanation and a > patch to fix it, to the Debian package maintainers over a year ago, > and within about a month of version 9.4 being uploaded to Debian. > > My recollection is that prior versions (9.2 and 9.3) compiled on > Alpha so the use of the wrong barrier, and the fix, was in fact > reported in a timely fashion following the first reasonable chance to > observe the problem. > > It has been built and running at Debian-Ports for over a year now as > I uploaded the fixed version to the Alpha unreleased distribution. Hi Michael, (I've discovered this branch of this thread only now, I got removed from CC.) Sorry for letting that rot for so long - I'd blame the Debian infrastructure for not showing ports information in the usual places. I've really only discovered the problem because buildd.debian.org is now showing the non-main architectures as well. (Of course we could just have looked at the bug report...) I guess we should look into making that even more visible. Is there a list of packages that have ports-only patches applied which we could use to make maintainers aware via ddpo/pts/tracker? Having a porter box available would help as well. > > It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth > > the trouble, since we're desupporting Alpha as of 9.5 anyway. > > That is disappointing to hear. Why is that? It is still in use on > Alpha. What is the maintenance load for keeping the Alpha arch > specific code? Fwiw I'd be curious to see if 9.5 still works using the generic primitives, but atm it's blocking on perl5.20: Dependency installability problem for postgresql-9.5 on alpha: postgresql-9.5 build-depends on: - alpha:libipc-run-perl alpha:libipc-run-perl depends on: - alpha:libio-pty-perl alpha:libio-pty-perl depends on missing: - alpha:perlapi-5.20.0 Christoph -- cb@df7cb.de | http://www.df7cb.de/
David Fetter <david@fetter.org> writes: > At a minimum, we should de-support every platform on which literally > no new deployments will ever happen. > I'm looking specifically at you, HPUX, and I could make a pretty good > case for the idea that we can relegate 32-bit platforms to the ash > heap of history, at least on the server side. This wasn't responded to very much, but I wanted to put on record that I don't agree with the concept. There are several reasons: 1. "Run on every platform you can" is in the DNA of this and just about every other successful open-source project. You don't want to drive away potential users by not supporting their platform. If they're still getting good use out of an old OS or non-mainstream architecture, who are we to tell them not to? 2. Even if a particular platform is no longer a credible target for production deployments, it can be a useful test case to ensure that we don't get frozen into a narrow "FooOS on x86_64 is the only case worth considering" straitjacket. Software monocultures are bad news; they tend not to adapt very well when the world changes. So for instance I'm reluctant to shut down pademelon, even though its compiler is old enough to vote, because it's one of not too darn many buildfarm animals whose compilers are not gcc or derivatives. We need cases like that to keep us from building in gcc-isms. In short, supporting old platforms is one of the ways that we stay flexible enough to be able to support new platforms in the future. 3. I see no reason to desupport platforms when we don't gain anything by it. In the case of Alpha, it's pretty clear what we gain: we don't have to worry about its unlike-anything-else memory coherency model. (I'm not very worried that future platforms will adopt that idea, either.) And the lack of any support from its remaining user community tilts the scales pretty heavily against it. 
I'll be happy to drop testing on HPUX 10.20, or the ancient OS X versions my other buildfarm critters run, the minute there is some feature we have a clear need for that one of them doesn't have. But I don't think it's desirable to cut anything off as long as it's still able to run a buildfarm member. I think those critters are still capable of catching unexpected portability issues that might affect more-viable platforms too. A useful comparison point is the testing Greg Stark did recently for VAX. Certainly no-one's ever again going to try to get useful work done with Postgres on a VAX, but that still taught us some good things about unnecessary IEEE-floating-point dependencies that had snuck into the code. Someday, that might be important; IEEE 754 won't be the last word on float arithmetic forever. As an example of a desupport proposal that I think *is* well-founded, see my nearby message <27975.1440961181@sss.pgh.pa.us>. regards, tom lane
On Mon, Aug 31, 2015 at 8:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A useful comparison point is the testing Greg Stark did recently for VAX.
> Certainly no-one's ever again going to try to get useful work done with
> Postgres on a VAX, but that still taught us some good things about
> unnecessary IEEE-floating-point dependencies that had snuck into the code.
> Someday, that might be important; IEEE 754 won't be the last word on
> float arithmetic forever.
Just by the way, there is at least one example of a non-IEEE floating point format supported by a current production compiler and hardware: IBM XL C on z/OS (and possibly other platforms) can use either IEEE or IBM's hex float format, depending on a compiler option.
https://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
Thomas Munro
http://www.enterprisedb.com
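As a rough illustration of what an "IEEE-floating-point dependency" check might look like, the sketch below tests whether double has the radix and precision of IEEE 754 binary64 using only <float.h>. It is not taken from PostgreSQL; the function names are hypothetical, and matching these three constants does not prove full IEEE 754 behaviour (NaNs, infinities, rounding). A hexadecimal format such as IBM's would report a FLT_RADIX of 16 and fail the first comparison.

```c
#include <assert.h>
#include <float.h>

/*
 * Returns 1 when double has the radix, mantissa width and exponent
 * range of IEEE 754 binary64, 0 otherwise.  Hypothetical helper.
 */
int demo_double_looks_like_binary64(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}

/*
 * Code that genuinely depends on the binary64 layout could state the
 * assumption explicitly instead of relying on it silently.
 */
void demo_require_binary64(void)
{
    assert(demo_double_looks_like_binary64());
}
```

Portable code would normally avoid the dependency altogether; an explicit check like this is mainly useful for documenting the few places where the layout really matters.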
On Sun, Aug 30, 2015 at 4:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Fetter <david@fetter.org> writes: >> At a minimum, we should de-support every platform on which literally >> no new deployments will ever happen. >> I'm looking specifically at you, HPUX, and I could make a pretty good >> case for the idea that we can relegate 32-bit platforms to the ash >> heap of history, at least on the server side. > > This wasn't responded to very much, but I wanted to put on record > that I don't agree with the concept. There are several reasons: > > 1. "Run on every platform you can" is in the DNA of this and just about > every other successful open-source project. You don't want to drive away > potential users by not supporting their platform. If they're still > getting good use out of an old OS or non-mainstream architecture, who are > we to tell them not to? > > 2. Even if a particular platform is no longer a credible target for > production deployments, it can be a useful test case to ensure that we > don't get frozen into a narrow "FooOS on x86_64 is the only case worth > considering" straitjacket. Software monocultures are bad news; they tend > not to adapt very well when the world changes. So for instance I'm > reluctant to shut down pademelon, even though its compiler is old enough > to vote, because it's one of not too darn many buildfarm animals whose > compilers are not gcc or derivatives. We need cases like that to keep us > from building in gcc-isms. In short, supporting old platforms is one of > the ways that we stay flexible enough to be able to support new platforms > in the future. > > 3. I see no reason to desupport platforms when we don't gain anything by > it. In the case of Alpha, it's pretty clear what we gain: we don't have > to worry about its unlike-anything-else memory coherency model. (I'm not > very worried that future platforms will adopt that idea, either.) 
And the > lack of any support from its remaining user community tilts the scales > pretty heavily against it. I'll be happy to drop testing on HPUX 10.20, > or the ancient OS X versions my other buildfarm critters run, the minute > there is some feature we have a clear need for that one of them doesn't > have. But I don't think it's desirable to cut anything off as long as > it's still able to run a buildfarm member. I think those critters are > still capable of catching unexpected portability issues that might affect > more-viable platforms too. I agree with all this. I doubt there is a big problem with supporting Alpha apart from lock-free algorithms. If critical sections are protected with locks, I don't believe Alpha requires any special handling. However, lock-free algorithms are important. Given a choice between continuing to introduce more of them where necessary to improve scalability and performance, and continuing to support Alpha, I doubt anyone here is prepared to vote for the latter. Even if Alpha servers were readily available both in the buildfarm and for developer testing, I suspect most people here would be skeptical about requiring future lock-free algorithms to support Alpha. But since they aren't available, imposing that requirement isn't even a realistic option. The best argument for continuing to support Alpha is probably that Linux does. I don't know how they do that. Presumably most Linux kernel developers don't have access to Alpha hardware, which makes me wonder how they avoid missing read_barrier_depends() in places where it is needed (since it's a no-op everywhere else). Considering that Linux's use of lock-free algorithms is vastly more extensive than ours, it would seem awfully difficult to avoid introducing bugs of that type. 
I previously argued that, rather than changing lwlock.c to use atomics categorically (and falling back to atomics emulation when real atomics are not available), we should have two implementations, one based on atomics and the other relying only on spinlocks. I believe if we'd done that, we would be in a position to continue supporting Alpha and whatever other weird stuff comes up in the future, because, again, I think lock-based algorithms should be solid everywhere. Once we didn't take that path, I think the die was cast. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2015-09-01 14:40:36 -0400, Robert Haas wrote: > I doubt there is a big problem with supporting Alpha apart from > lock-free algorithms. Note that we've had lock-free algorithms for years. E.g. the changecount stuff in pgstat.c. > The best argument for continuing to support Alpha is probably that > Linux does. Not sure why that's an argument? I mean linux supports architectures without an MMU, but we'll surely never? > I don't know how they do that. Presumably most Linux > kernel developers don't have access to Alpha hardware, which makes me > wonder how they avoid missing read_barrier_depends() in places where > it is needed (since it's a no-op everywhere else). I think they do miss it regularly from what I'm skimming on these lists. > Considering that Linux's use of lock-free algorithms is vastly more > extensive than ours, it would seem awfully difficult to avoid > introducing bugs of that type. In a lot of cases they've embedded the read_barrier_depends() in macros. E.g. when doing concurrent stuff involving rcu you're only ever supposed to dereference memory using rcu_dereference() which on most architectures is just a volatile cast to force a read from memory, but includes a smp_read_barrier_depends(). There's a bunch of other similar cases. > I previously argued that, rather than changing lwlock.c to use atomics > categorically (and falling back to atomics emulation when real atomics > are not available), we should have two implementations, one based on > atomics and the other relying only on spinlocks. I still think that'd have been a utter horrible mistake. lwlock.c is already complicated enough. That it actually ends up being faster when implemented using atomics implementation rather than spinlocks over the full perdiod doesn't hurt either. 
> I believe if we'd done that, we would be in a position to continue > supporting Alpha and whatever other weird stuff comes up in the > future, because, again, I think lock-based algorithms should be solid > everywhere. Once we didn't take that path, I think the die was cast. I'm not following how those are related - the relevant pointer chasing in lwlock.c should actually be safe on alpha (as done under a spinlock). And whether lwlocks is implemented primarily using spinlocks or atomics doesn't have a bearing on the data dependency barriers? There might be a data dependency missing somewhere, but ...? Since alpha has easy to use atomics support it'd actually have ended up using the gcc generics and used the atomics implementation anyway. Greetings, Andres Freund
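The point above about hiding the data-dependency barrier inside an accessor macro can be illustrated with a small sketch. This is not the Linux rcu_dereference() implementation and not PostgreSQL code; it uses portable C11 atomics, and the names Node, shared_head, publish() and deref_published() are hypothetical. The consume load plays the role of "load the pointer, then apply whatever dependency barrier the platform needs"; current compilers generally treat memory_order_consume as memory_order_acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload reached through a shared pointer. */
typedef struct Node
{
    int payload;
} Node;

/* Hypothetical shared slot; NULL until a writer publishes a node. */
static _Atomic(Node *) shared_head = NULL;

/*
 * Writer side: initialize the node first, then publish the pointer with
 * release ordering so the initialization is visible before the pointer.
 */
static inline void
publish(_Atomic(Node *) *slot, Node *n)
{
    atomic_store_explicit(slot, n, memory_order_release);
}

/*
 * Reader side: the only sanctioned way to fetch the pointer.  Any ordering
 * a platform needs between loading the pointer and dereferencing it lives
 * here, so callers cannot forget it.
 */
static inline Node *
deref_published(_Atomic(Node *) *slot)
{
    return atomic_load_explicit(slot, memory_order_consume);
}

/* Example caller: returns -1 if nothing has been published yet. */
static int
read_payload(void)
{
    Node *n = deref_published(&shared_head);

    return n ? n->payload : -1;
}
```

The design choice is the one described for the kernel: callers never touch the shared pointer directly, so a platform-specific barrier only has to be right in one place.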
Robert Haas <robertmhaas@gmail.com> writes: > The best argument for continuing to support Alpha is probably that > Linux does. I don't know how they do that. My sneaking suspicion is that they don't very well. In particular, unless I misunderstand things fundamentally, the coherency issues would be invisible without a multi-CPU machine, and there are probably not that many multi-CPU Alphas still alive. The kernel could well be full of bugs that don't manifest on single-CPU Alphas. I also note that nominal support is quite different from being production grade. Red Hat, for instance, never supported Alpha hardware (at least not while I was there), and I doubt that any other commercial Linux support provider has supported it in a long time either. If there were bugs, how many people would notice or care? regards, tom lane
On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote: > On 2015-09-01 14:40:36 -0400, Robert Haas wrote: >> I doubt there is a big problem with supporting Alpha apart from >> lock-free algorithms. > > Note that we've had lock-free algorithms for years. E.g. the changecount > stuff in pgstat.c. Hmm, true. I think that stuff is probably missing some barriers that are technically required even on mainstream platforms. But the races are narrow, so you may not see any problem in practice, and if you do, the worst that'll happen is some junk in pg_stat_activity. >> The best argument for continuing to support Alpha is probably that >> Linux does. > > Not sure why that's an argument? I mean linux supports architectures > without an MMU, but we'll surely never? I'm just saying that, we're arguing that we can't do it, but they're doing it, so presumably we could find a way if we were really determined. I'm not saying that it's a good use of time, but Linux seems to think it is. >> I previously argued that, rather than changing lwlock.c to use atomics >> categorically (and falling back to atomics emulation when real atomics >> are not available), we should have two implementations, one based on >> atomics and the other relying only on spinlocks. > > I still think that'd have been a utter horrible mistake. lwlock.c is > already complicated enough. That it actually ends up being faster when > implemented using atomics implementation rather than spinlocks over the > full perdiod doesn't hurt either. I don't know what a "perdiod" is. >> I believe if we'd done that, we would be in a position to continue >> supporting Alpha and whatever other weird stuff comes up in the >> future, because, again, I think lock-based algorithms should be solid >> everywhere. Once we didn't take that path, I think the die was cast. > > I'm not following how those are related - the relevant pointer chasing > in lwlock.c should actually be safe on alpha (as done under a > spinlock). 
And whether lwlocks is implemented primarily using spinlocks > or atomics doesn't have a bearing on the data dependency barriers? There > might be a data dependency missing somewhere, but ...? > > Since alpha has easy to use atomics support it'd actually have ended up > using the gcc generics and used the atomics implementation anyway. If all of your concurrency control looks like this:

    SpinLockAcquire(&mutex); // barrier
    // do stuff
    SpinLockRelease(&mutex); // also a barrier

...then I think it doesn't matter what wonky stuff Alpha does. Mutual exclusion is mutual exclusion, full stop. When you start doing things that use pg_atomic_uint32, or, as you mention, the st_changecount protocol, you are now potentially relying on memory-ordering semantics that may vary among platforms. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
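To make the contrast with the spinlock case concrete, here is a minimal sketch of a changecount-style (seqlock-like) protocol with the barriers written out explicitly. It is an illustration under assumptions, not the pgstat.c code: it uses C11 atomics rather than PostgreSQL's own barrier primitives, it assumes a single writer per slot, and StatusSlot, slot_init(), slot_write() and slot_read() are hypothetical names. The data fields are relaxed atomics only so that the sketch is free of data races under the C11 memory model.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical status slot guarded by a change counter instead of a lock. */
typedef struct StatusSlot
{
    atomic_uint changecount;    /* odd while a write is in progress */
    atomic_int  field_a;
    atomic_int  field_b;
} StatusSlot;

static void
slot_init(StatusSlot *s)
{
    atomic_init(&s->changecount, 0);
    atomic_init(&s->field_a, 0);
    atomic_init(&s->field_b, 0);
}

/* Single writer assumed: bump to odd, write the data, bump to even. */
static void
slot_write(StatusSlot *s, int a, int b)
{
    unsigned c = atomic_load_explicit(&s->changecount, memory_order_relaxed);

    atomic_store_explicit(&s->changecount, c + 1, memory_order_relaxed);
    /* Order the "write in progress" mark before the data stores. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->field_a, a, memory_order_relaxed);
    atomic_store_explicit(&s->field_b, b, memory_order_relaxed);
    /* Release store: the data stores become visible before the even count. */
    atomic_store_explicit(&s->changecount, c + 2, memory_order_release);
}

/* Reader retries until it sees the same even count before and after. */
static void
slot_read(StatusSlot *s, int *a, int *b)
{
    for (;;)
    {
        unsigned before = atomic_load_explicit(&s->changecount,
                                               memory_order_acquire);
        int      ta = atomic_load_explicit(&s->field_a, memory_order_relaxed);
        int      tb = atomic_load_explicit(&s->field_b, memory_order_relaxed);
        unsigned after;

        /* Order the data loads before re-reading the counter. */
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->changecount, memory_order_relaxed);

        if (before == after && (before & 1) == 0)
        {
            *a = ta;
            *b = tb;
            return;
        }
    }
}
```

Dropping either fence leaves code that may still appear to work on strongly ordered hardware, which is the kind of narrow race discussed upthread; with a spinlock around both sides, none of this ordering reasoning would be needed.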
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote: >> On 2015-09-01 14:40:36 -0400, Robert Haas wrote: >>> The best argument for continuing to support Alpha is probably that >>> Linux does. >> Not sure why that's an argument? I mean linux supports architectures >> without an MMU, but we'll surely never? > I'm just saying that, we're arguing that we can't do it, but they're > doing it, so presumably we could find a way if we were really > determined. I'm not saying that it's a good use of time, but Linux > seems to think it is. I think we've probably beat this to death. Nobody here believes that it's sane to try to support Alpha without access to hardware, and no offer of hardware has been forthcoming. If one were to materialize, we could usefully have a debate about whether it's worth doing ... regards, tom lane
On Tue, Sep 1, 2015 at 1:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think we've probably beat this to death. Nobody here believes that > it's sane to try to support Alpha without access to hardware, and no > offer of hardware has been forthcoming. If one were to materialize, > we could usefully have a debate about whether it's worth doing ... I agree. I can't believe how seriously Alpha support has been debated here. I think that the Linux implementation is simply very limited, or broken. Andres mentioned Linux supporting systems without MMUs/paging. I imagine this was based on this paragraph in the Linux README: Linux is easily portable to most general-purpose 32- or 64-bit architectures as long as they have a paged memory management unit (PMMU) and a port of the GNU C compiler (gcc) (part of The GNU Compiler Collection, GCC). Linux has also been ported to a number of architectures without a PMMU, although functionality is then obviously somewhat limited. I'm not sure how or to what degree these systems lacking an MMU have limited support, but I think it's fair to speculate that Alpha may similarly have severe limitations, or even severe bugs (just like Postgres 9.4's Alpha support). -- Peter Geoghegan
On 09/01/2015 01:18 PM, Peter Geoghegan wrote: > On Tue, Sep 1, 2015 at 1:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think we've probably beat this to death. Nobody here believes that >> it's sane to try to support Alpha without access to hardware, and no >> offer of hardware has been forthcoming. If one were to materialize, >> we could usefully have a debate about whether it's worth doing ... > > I agree. I can't believe how seriously Alpha support has been debated > here. I think that the Linux implementation is simply very limited, or > broken. It isn't even made any more. Alpha is dead except for obscure hobbyists. We aren't Debian, we should be much more stringent on the platforms we support. Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
On Tue, Sep 1, 2015 at 4:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote: >>> On 2015-09-01 14:40:36 -0400, Robert Haas wrote: >>>> The best argument for continuing to support Alpha is probably that >>>> Linux does. > >>> Not sure why that's an argument? I mean linux supports architectures >>> without an MMU, but we'll surely never? > >> I'm just saying that, we're arguing that we can't do it, but they're >> doing it, so presumably we could find a way if we were really >> determined. I'm not saying that it's a good use of time, but Linux >> seems to think it is. > > I think we've probably beat this to death. Nobody here believes that > it's sane to try to support Alpha without access to hardware, and no > offer of hardware has been forthcoming. If one were to materialize, > we could usefully have a debate about whether it's worth doing ... Yep. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company