Thread: 9.4 broken on alpha

9.4 broken on alpha

From
Christoph Berg
Date:
Hi,

From the Debian ports buildd:

https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-1&stamp=1434132509

make[5]: Entering directory '/«PKGBUILDDIR»/build/src/backend/postmaster'
[...]
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat -Werror=format-security
-I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include -I/«PKGBUILDDIR»/build/../src/include
-D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/tcl8.6  -c -o bgworker.o
/«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c
/tmp/cc4j88on.s: Assembler messages:
/tmp/cc4j88on.s:952: Error: unknown opcode `rmb'
as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg

as: Please report this bug.

<builtin>: recipe for target 'bgworker.o' failed
make[5]: *** [bgworker.o] Error 1

There's a proposed patch:

https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;msg=5;bug=756368;filename=alpha-fix-read-memory-barrier.patch

Christoph Berg
--
Senior Berater, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE


Re: 9.4 broken on alpha

From
Andrew Dunstan
Date:

On 08/25/2015 06:16 AM, Christoph Berg wrote:
> Hi,
>
> From the Debian ports buildd:
>
> https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-1&stamp=1434132509
>
> make[5]: Entering directory '/«PKGBUILDDIR»/build/src/backend/postmaster'
> [...]
> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat
> -Werror=format-security -I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include
> -I/«PKGBUILDDIR»/build/../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/tcl8.6
> -c -o bgworker.o /«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c
> /tmp/cc4j88on.s: Assembler messages:
> /tmp/cc4j88on.s:952: Error: unknown opcode `rmb'
> as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg
>
> as: Please report this bug.
>
> <builtin>: recipe for target 'bgworker.o' failed
> make[5]: *** [bgworker.o] Error 1


needs a buildfarm animal. If we had one we'd presumably have caught this
much earlier.


cheers

andrew





Re: 9.4 broken on alpha

From
Andres Freund
Date:
On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote:
> needs a buildfarm animal. If we had one we'd presumably have caught this
> much earlier.

On the other hand, we dropped alpha support in 9.5, ...



Re: 9.4 broken on alpha

From
Andrew Dunstan
Date:

On 08/25/2015 08:30 AM, Andres Freund wrote:
> On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote:
>> needs a buildfarm animal. If we had one we'd presumably have caught this
>> much earlier.
> On the other hand, we dropped alpha support in 9.5, ...



Oh, I missed that. Sorry for the noise.

cheers

andrew



Re: 9.4 broken on alpha

From
"Aaron W. Swenson"
Date:
On 2015-08-25 08:57, Andrew Dunstan wrote:
> On 08/25/2015 08:30 AM, Andres Freund wrote:
> > On 2015-08-25 08:29:18 -0400, Andrew Dunstan wrote:
> >> needs a buildfarm animal. If we had one we'd presumably have caught this
> >> much earlier.
> > On the other hand, we dropped alpha support in 9.5, ...
> Oh, I missed that. Sorry for the noise.

I've been meaning to report this myself.

In the 4 years that that particular line has been there, not once had
anyone else run into it on Gentoo until a couple months ago.

And it isn't a case of end users missing it as we have arch testers
that test packages before marking them suitable for public consumption.

Alpha is one of the arches.

As for the dropped support, has the Alpha specific code been ripped
out? Would it still presumably run on Alpha?

Re: 9.4 broken on alpha

From
Alvaro Herrera
Date:
Aaron W. Swenson wrote:

> I've been meaning to report this myself.
> 
> In the 4 years that that particular line has been there, not once had
> anyone else run into it on Gentoo until a couple months ago.
> 
> And it isn't a case of end users missing it as we have arch testers
> that test packages before marking them suitable for public consumption.
> 
> Alpha is one of the arches.

This means that not once has anybody compiled on an Alpha in 4 years.

> As for the dropped support, has the Alpha specific code been ripped
> out? Would it still presumably run on Alpha?

Yes, code has been ripped out.  I would assume that it doesn't build at
all anymore, but maybe what happens is you get spinlocks emulated with
semaphores and it's only horribly slow.  See
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
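
For anyone wondering what "spinlocks emulated with semaphores" looks like in practice, here is a minimal sketch in C. It assumes POSIX unnamed semaphores, and every sketch_* name is hypothetical; PostgreSQL's real fallback differs in detail. The point is only that each lock operation becomes a semaphore call, which is typically much heavier than a single test-and-set instruction.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical lock type: one binary semaphore per "spinlock". */
typedef sem_t sketch_slock_t;

/* Initialise the lock as free (semaphore value 1, not shared across processes). */
static void sketch_lock_init(sketch_slock_t *lock)
{
    int rc = sem_init(lock, 0, 1);

    assert(rc == 0);
    (void) rc;
}

/* Try to take the lock without blocking: returns 1 on success, 0 if already held. */
static int sketch_lock_try(sketch_slock_t *lock)
{
    return sem_trywait(lock) == 0;
}

/* Block until the lock is acquired, retrying if a signal interrupts the wait. */
static void sketch_lock_acquire(sketch_slock_t *lock)
{
    while (sem_wait(lock) != 0 && errno == EINTR)
        ;
}

/* Release the lock by putting the semaphore back to 1. */
static void sketch_lock_release(sketch_slock_t *lock)
{
    sem_post(lock);
}
```

A caller would run sketch_lock_init() once, then bracket each critical section with sketch_lock_acquire() and sketch_lock_release().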



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Aaron W. Swenson wrote:
>> In the 4 years that that particular line has been there, not once had
>> anyone else run into it on Gentoo until a couple months ago.
>> And it isn't a case of end users missing it as we have arch testers
>> that test packages before marking them suitable for public consumption.
>> Alpha is one of the arches.

> This means that not once has anybody compiled on an Alpha in 4 years.

Well, strictly speaking, there were no uses of pg_read_barrier until 9.4.
However, pg_write_barrier (which used "wmb") was in use since 9.2; so
unless you're claiming your assembler knows wmb but not rmb, the code's
failed to compile for Alpha since 9.2.

>> As for the dropped support, has the Alpha specific code been ripped
>> out? Would it still presumably run on Alpha?

> Yes, code has been ripped out.  I would assume that it doesn't build at
> all anymore, but maybe what happens is you get spinlocks emulated with
> semaphores and it's only horribly slow.

The whole business about laxer-than-average memory coherency gives me the
willies, though.  It's fairly likely that PG has never worked right on
multi-CPU Alphas.
        regards, tom lane



Re: 9.4 broken on alpha

From
Andres Freund
Date:
On 2015-08-25 15:43:12 -0400, Aaron W. Swenson wrote:
> As for the dropped support, has the Alpha specific code been ripped
> out? Would it still presumably run on Alpha?

I'm pretty sure that postgres hasn't run correctly under concurrency on
alpha for a long while. The lax cache coherency makes developing
concurrent code hard. Since there are rarely, if ever, people testing
postgres on alpha under load it's nigh on impossible to verify anything.

Having to adhere to a more complicated memory model than for any other
architecture isn't worth it, since there barely are users.

Greetings,

Andres Freund



Re: 9.4 broken on alpha

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Aaron W. Swenson wrote:
> >> In the 4 years that that particular line has been there, not once had
> >> anyone else run into it on Gentoo until a couple months ago.
> >> And it isn't a case of end users missing it as we have arch testers
> >> that test packages before marking them suitable for public consumption.
> >> Alpha is one of the arches.
> 
> > > This means that not once has anybody compiled on an Alpha in 4 years.
> 
> Well, strictly speaking, there were no uses of pg_read_barrier until 9.4.
> However, pg_write_barrier (which used "wmb") was in use since 9.2; so
> unless you're claiming your assembler knows wmb but not rmb, the code's
> failed to compile for Alpha since 9.2.

Actually according to this
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf
there is a wmb instruction but there is no rmb.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Well, strictly speaking, there were no uses of pg_read_barrier until 9.4.
>> However, pg_write_barrier (which used "wmb") was in use since 9.2; so
>> unless you're claiming your assembler knows wmb but not rmb, the code's
>> failed to compile for Alpha since 9.2.

> Actually according to this
> http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf
> there is a wmb instruction but there is no rmb.

Oh really?  If rmb were a figment of someone's imagination, it would
explain the build failure (although not why nobody's reported it till
now).

It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
the trouble, since we're desupporting Alpha as of 9.5 anyway.  If the
effective desupport date is 9.4 instead, how much difference does that
make?
        regards, tom lane
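
The s/rmb/mb/ change discussed above can be sketched roughly as follows. This is an illustration and not the actual patch: the sketch_* names are hypothetical (modelled on the pg_*_barrier macros named in this thread), and the __sync_synchronize() fallback for non-Alpha targets is an assumption made only so the sketch compiles elsewhere. Per the thread, the Alpha assembler knows "wmb" but not "rmb", so the read barrier falls back to the full "mb".

```c
#include <assert.h>
#include <stddef.h>

#if defined(__alpha) || defined(__alpha__)
/* Alpha has "mb" (full barrier) and "wmb" (write barrier), but no "rmb". */
#define sketch_memory_barrier() __asm__ __volatile__ ("mb"  : : : "memory")
#define sketch_read_barrier()   __asm__ __volatile__ ("mb"  : : : "memory")  /* was "rmb" */
#define sketch_write_barrier()  __asm__ __volatile__ ("wmb" : : : "memory")
#else
/* Assumption: on any other target, use the GCC full-barrier builtin. */
#define sketch_memory_barrier() __sync_synchronize()
#define sketch_read_barrier()   __sync_synchronize()
#define sketch_write_barrier()  __sync_synchronize()
#endif

static int sketch_payload;
static volatile int sketch_ready;

/* Writer side: store the payload first, then set the flag. */
static void sketch_publish(int value)
{
    sketch_payload = value;
    sketch_write_barrier();     /* payload must be visible before the flag */
    sketch_ready = 1;
}

/* Reader side: returns 1 and copies the payload if it was published, else 0. */
static int sketch_consume(int *out)
{
    assert(out != NULL);
    if (!sketch_ready)
        return 0;
    sketch_read_barrier();      /* do not read the payload ahead of the flag */
    *out = sketch_payload;
    return 1;
}
```

Using the full barrier where a read barrier would do is stronger than strictly necessary, but it is the only read-ordering instruction the assembler accepts here.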



Re: 9.4 broken on alpha

From
Michael Cree
Date:
On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Well, strictly speaking, there were no uses of pg_read_barrier until 9.4.
> >> However, pg_write_barrier (which used "wmb") was in use since 9.2; so
> >> unless you're claiming your assembler knows wmb but not rmb, the code's
> >> failed to compile for Alpha since 9.2.
> 
> > Actually according to this
> > http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f98/doc/alpha-asm.pdf
> > there is a wmb instruction but there is no rmb.

Exactly, as I had explained in the Debian bug report [1] over a year
ago.

> Oh really?  If rmb were a figment of someone's imagination, it would
> explain the build failure (although not why nobody's reported it till
> now).

I reported the failure to build on Alpha, with an explanation and a
patch to fix it, to the Debian package maintainers over a year ago,
and within about a month of version 9.4 being uploaded to Debian.

My recollection is that prior versions (9.2 and 9.3) compiled on
Alpha so the use of the wrong barrier, and the fix, was in fact
reported in a timely fashion following the first reasonable chance to
observe the problem.

It has been built and running at Debian-Ports for over a year now as
I uploaded the fixed version to the Alpha unreleased distribution.

> It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
> the trouble, since we're desupporting Alpha as of 9.5 anyway.

That is disappointing to hear.  Why is that?  It is still in use on
Alpha.  What is the maintenance load for keeping the Alpha arch
specific code?

> If the effective desupport date is 9.4 instead,

It's not.  The fixed and built 9.4 version was uploaded to Debian-Ports
Alpha (in the unreleased distribution) and has been in use for over a
year.

Regards,
Michael.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=756368



Re: 9.4 broken on alpha

From
Alvaro Herrera
Date:
Michael Cree wrote:
> On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote:

> > Oh really?  If rmb were a figment of someone's imagination, it would
> > explain the build failure (although not why nobody's reported it till
> > now).
> 
> I reported the failure to build on Alpha, with an explanation and a
> patch to fix it, to the Debian package maintainers over a year ago,
> and within about a month of version 9.4 being uploaded to Debian.

It's a pretty disappointing packaging process failure that the bug
report wasn't sent to upstream immediately, rather than waiting for a
whole year.  That would have made it a lot less likely that the removal
of the port would have passed muster in the first place.  Supposedly we
were only removing stuff that was pretty clearly dead.

> It has been built and running at Debian-Ports for over a year now as
> I uploaded the fixed version to the Alpha unreleased distribution.

Has this been battle-tested under high load in multi-core servers?
Because based on other comments in this and other threads, it seems
likely that the port is subtly broken.

> > It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
> > the trouble, since we're desupporting Alpha as of 9.5 anyway.
> 
> That is disappointing to hear.  Why is that?  It is still in use on
> Alpha.  What is the maintenance load for keeping the Alpha arch
> specific code?

The amount of code that was removed by the commit isn't all that much:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d
but there's been rather a lot of work after that to add support for
atomic primitives as well as barriers, which would presumably not be
trivial to implement and test on Alpha.  Someone would have to volunteer
to write, test and maintain that code.  A buildfarm machine
would be mandatory, too.

I'm under the impression that Alpha machines are no longer being built,
so I'm doubtful that this would be a good use of anybody's time.

> > If the effective desupport date is 9.4 instead,
> 
> It's not.  The fixed and built 9.4 version was uploaded to Debian-Ports
> Alpha (in the unreleased distribution) and has been in use for over a
> year.

I think we could apply the bugfix to 9.4, but this doesn't help with
9.5.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Michael Cree <mcree@orcon.net.nz> writes:
> On Tue, Aug 25, 2015 at 06:09:17PM -0400, Tom Lane wrote:
>> It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
>> the trouble, since we're desupporting Alpha as of 9.5 anyway.

> That is disappointing to hear.  Why is that?  It is still in use on
> Alpha.  What is the maintenance load for keeping the Alpha arch
> specific code?

The core problem is that Alpha's unusually lax memory coherency model
creates design and testing problems we face with no other architecture.
We're not really excited about carrying that burden for a legacy
architecture that isn't competitive in the modern world.  Even if we
were, it's completely impractical to maintain such an unusual port
when there is no representative of the architecture in our buildfarm
(http://buildfarm.postgresql.org/cgi-bin/show_status.pl).  It's worth
pointing out that had there been one, we would have noticed the rmb
problem long before 9.4 ever shipped.

I do not know anything about the prevalence of multi-CPU Alpha machines.
If they're thin on the ground compared to single-CPU, maybe we could just
document that we only support the latter, and not worry too much about
the memory coherency issues.  But in any case, without a commitment from
somebody to maintain an Alpha buildfarm member, we will absolutely not
consider reviving that port.
        regards, tom lane



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Michael Cree wrote:
>> That is disappointing to hear.  Why is that?  It is still in use on
>> Alpha.  What is the maintenance load for keeping the Alpha arch
>> specific code?

> The amount of code that was removed by the commit isn't all that much:
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a6d488cb538c8761658f0f7edfc40cecc8c29f2d
> but there's been rather a lot of work after that to add support for
> atomic primitives as well as barriers, which would presumably not be
> trivial to implement and test on Alpha.  Someone would have to volunteer
> to write, test and maintain that code.

As far as that goes, we do have fallback atomics code that's supposed to
work on anything (and not be unusably slow).  So in principle,
resurrecting the Alpha spinlock code ought to be enough to get back to the
previous level of support.  Coding Alpha atomic primitives would likely
be worth doing, if there's somebody out there who's excited enough to take
it on; but that could happen later, and incrementally.

> A buildfarm machine would be mandatory, too.

That, however, is not negotiable.
        regards, tom lane



Re: 9.4 broken on alpha

From
Andres Freund
Date:
On 2015-08-26 12:49:46 -0400, Tom Lane wrote:
> As far as that goes, we do have fallback atomics code that's supposed to
> work on anything (and not be unusably slow).  So in principle,
> resurrecting the Alpha spinlock code ought to be enough to get back to the
> previous level of support.  Coding Alpha atomic primitives would likely
> be worth doing, if there's somebody out there who's excited enough to take
> it on; but that could happen later, and incrementally.

Actually, on linux and most other OSs it should just use the generic gcc
based implementation and be pretty close to optimal. The only thing it'd
need would be to define the memory barriers, so the fallback
implementation of those isn't used.


But I really strongly object to re-introducing alpha support. Having to
care about data dependency barriers is a huge pita, and it complicates
code for everyone. And we'd have to investigate a lot of code to
actually make it work reliably. For what benefit?

> > A buildfarm machine would be mandatory, too.
> 
> That, however, is not negotiable.

If we really were to re-introduce this we'd need an actual developer
machine to run tests against.
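
To make the "data dependency barriers" point concrete, here is a minimal sketch in C11, assuming <stdatomic.h> is available; the sketch_* names are hypothetical and nothing here is PostgreSQL code. A reader loads a pointer and then dereferences it. On most architectures the hardware orders the dependent load after the pointer load by itself; Alpha does not guarantee that, so the reader needs explicit ordering even for this pattern. The sketch uses an acquire load for that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_node
{
    int value;
};

/* Shared pointer that a writer publishes and readers follow. */
static _Atomic(struct sketch_node *) sketch_head = NULL;

/* Writer: fill in the node, then publish the pointer with release ordering. */
static void sketch_publish_node(struct sketch_node *n, int value)
{
    assert(n != NULL);
    n->value = value;
    atomic_store_explicit(&sketch_head, n, memory_order_release);
}

/* Reader: returns 1 and copies the value if a node was published, else 0. */
static int sketch_read_node(int *out)
{
    /* The acquire load orders the later n->value read after the pointer load. */
    struct sketch_node *n =
        atomic_load_explicit(&sketch_head, memory_order_acquire);

    if (n == NULL)
        return 0;
    *out = n->value;
    return 1;
}
```

Lock-free code written without this ordering in mind tends to work on other platforms anyway, which is why auditing for it is the expensive part.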



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> But I really strongly object to re-introducing alpha support. Having to
> care about data dependency barriers is a huge pita, and it complicates
> code for everyone. And we'd have to investigate a lot of code to
> actually make it work reliably. For what benefit?

I hear you, but that's only an issue for multi-CPU machines no?  If we
just say "we doubt this works on multi-CPU Alphas, if it breaks you get to
keep both pieces", then we're basically at the same place we were before.

To be clear: I don't want to do the work you're speaking of, either.
But if we have people who were successfully using PG on Alphas before,
the coherency issues must not have been a problem for them.  Can't we
just (continue to) ignore the issue?

> If we really were to re-introduce this we'd need an actual developer
> machine to run tests against.

I would certainly expect that we'd insist on active support from the Alpha
community; we're not going to continue to do this in an open-loop fashion.
        regards, tom lane



Re: 9.4 broken on alpha

From
Noah Misch
Date:
On Wed, Aug 26, 2015 at 12:49:46PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > A buildfarm machine would be mandatory, too.
> 
> That, however, is not negotiable.

Right.  I think the still-open question around PostgreSQL on Alpha is whether
9.1 through 9.4 are meaningfully supported there.  Step one for anyone
interested in Alpha support is to activate a buildfarm member covering that
range of versions.  Without that, the PostgreSQL community is just listening
for bug fix contributions.

On Wed, Aug 26, 2015 at 01:34:38PM -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > But I really strongly object to re-introducing alpha support. Having to
> > care about data dependency barriers is a huge pita, and it complicates
> > code for everyone. And we'd have to investigate a lot of code to
> > actually make it work reliably. For what benefit?
> 
> I hear you, but that's only an issue for multi-CPU machines no?  If we
> just say "we doubt this works on multi-CPU Alphas, if it breaks you get to
> keep both pieces", then we're basically at the same place we were before.
> 
> To be clear: I don't want to do the work you're speaking of, either.
> But if we have people who were successfully using PG on Alphas before,
> the coherency issues must not have been a problem for them.  Can't we
> just (continue to) ignore the issue?

The landscape changed with the 9.5 cycle's push to use more lock-free
algorithms.  Dropping Alpha support simplified review for those algorithms.
True, we could ignore Alpha for review purposes and accept unstudied damage to
reliability on Alpha.  To some extent, that characterizes any platform whose
test reports don't reach us.  It's different when we know Alpha has special
needs and we make changes in the area of those needs without even attempting
to meet them.  We made a decision to instead break compatibility explicitly,
and I don't think this thread has impugned that decision.

As it is, we've implicitly prepared to ship Alpha-supporting PostgreSQL 9.4
until 2019, by which time the newest Alpha hardware will be 15 years old.
Computer museums would be our only audience for continued support.  I do have
a sentimental weakness for computer museums, but not at the price of drag on
important performance work.

nm



Re: 9.4 broken on alpha

From
Christoph Berg
Date:
Re: Andrew Dunstan 2015-08-25 <55DC5F9E.60601@dunslane.net>
> >gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
> >-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -g -O2 -Wformat
> >-Werror=format-security -I/usr/include/mit-krb5 -fPIC -pie -DLINUX_OOM_SCORE_ADJ=0 -I../../../src/include
> >-I/«PKGBUILDDIR»/build/../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include/tcl8.6
> >-c -o bgworker.o /«PKGBUILDDIR»/build/../src/backend/postmaster/bgworker.c
> >/tmp/cc4j88on.s: Assembler messages:
> >/tmp/cc4j88on.s:952: Error: unknown opcode `rmb'
> >as: BFD (GNU Binutils for Debian) 2.25 internal error, aborting at ../../gas/write.c line 603 in size_seg
> 
> 
> needs a buildfarm animal. If we had one we'd presumably have caught this
> much earlier.

In the meantime, I've added this patch to the 9.4 Debian package, and
the build including check-world succeeds:

https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=alpha&ver=9.4.4-2&stamp=1440797195

It'd be nice if the patch could get applied to 9.4 and earlier.

Thanks,
Christoph
-- 
Senior Berater, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE



Re: 9.4 broken on alpha

From
David Fetter
Date:
On Wed, Aug 26, 2015 at 09:19:09PM -0400, Noah Misch wrote:
> As it is, we've implicitly prepared to ship Alpha-supporting
> PostgreSQL 9.4 until 2019, by which time the newest Alpha hardware
> will be 15 years old.  Computer museums would be our only audience
> for continued support.  I do have a sentimental weakness for
> computer museums, but not at the price of drag on important
> performance work.

+1000

I think we need to take realistic stock of what we're doing and what
we should require.

At a minimum, we should de-support every platform on which literally
no new deployments will ever happen.

I'm looking specifically at you, HPUX, and I could make a pretty good
case for the idea that we can relegate 32-bit platforms to the ash
heap of history, at least on the server side.

Then, there's the question of rotating media.  Given the givens, we
ought to be drawing up plans for the cases where we might consider
supporting them, but those would need to be zero-based plans, i.e. the
starting point would be that we don't support them, and arguments
would have to be made affirmatively to support them for some specific,
demonstrable use case.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: 9.4 broken on alpha

From
Andres Freund
Date:
On 2015-08-29 08:32:29 -0700, David Fetter wrote:
> and I could make a pretty good
> case for the idea that we can relegate 32-bit platforms to the ash
> heap of history, at least on the server side.

Don't see the point, it doesn't cost us very much.

> Then, there's the question of rotating media.  Given the givens, we
> ought to be drawing up plans for the cases where we might consider
> supporting them, but those would need to be zero-based plans, i.e. the
> starting point would be that we don't support them, and arguments
> would have to be made affirmatively to support them for some specific,
> demonstrable use case.

We don't particularly care either way, so I don't see why we'd add/drop
support for anything here.



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Christoph Berg <christoph.berg@credativ.de> writes:
> It'd be nice if the patch could get applied to 9.4 and earlier.

I've pushed that patch into 9.4.  Barring somebody stepping forward
with an offer of a buildfarm member and any other necessary developer
support, I do not think there will be any further consideration of
reversing our decision to drop Alpha support as of 9.5.
        regards, tom lane



Re: 9.4 broken on alpha

From
Christoph Berg
Date:
Re: Michael Cree 2015-08-26 <20150826052530.GA4256@tower>
> I reported the failure to build on Alpha, with an explanation and a
> patch to fix it, to the Debian package maintainers over a year ago,
> and within about a month of version 9.4 being uploaded to Debian.
> 
> My recollection is that prior versions (9.2 and 9.3) compiled on
> Alpha so the use of the wrong barrier, and the fix, was in fact
> reported in a timely fashion following the first reasonable chance to
> observe the problem.
> 
> It has been built and running at Debian-Ports for over a year now as
> I uploaded the fixed version to the Alpha unreleased distribution.

Hi Michael,

(I've discovered this branch of this thread only now, I got removed
from CC.)

Sorry for letting that rot for so long - I'd blame the Debian
infrastructure for not showing ports information in the usual places.
I've really only discovered the problem because buildd.debian.org is
now showing the non-main architectures as well. (Of course we could
just have looked at the bug report...) I guess we should look into
making that even more visible. Is there a list of packages that have
ports-only patches applied which we could use to make maintainers
aware via ddpo/pts/tracker? Having a porter box available would help
as well.

> > It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
> > the trouble, since we're desupporting Alpha as of 9.5 anyway.
> 
> That is disappointing to hear.  Why is that?  It is still in use on
> Alpha.  What is the maintenance load for keeping the Alpha arch
> specific code?

Fwiw I'd be curious to see if 9.5 still works using the generic
primitives, but atm it's blocking on perl5.20:

Dependency installability problem for postgresql-9.5 on alpha:

postgresql-9.5 build-depends on:
- alpha:libipc-run-perl
alpha:libipc-run-perl depends on:
- alpha:libio-pty-perl
alpha:libio-pty-perl depends on missing:
- alpha:perlapi-5.20.0

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/



Re: 9.4 broken on alpha

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> At a minimum, we should de-support every platform on which literally
> no new deployments will ever happen.
> I'm looking specifically at you, HPUX, and I could make a pretty good
> case for the idea that we can relegate 32-bit platforms to the ash
> heap of history, at least on the server side.

This wasn't responded to very much, but I wanted to put on record
that I don't agree with the concept.  There are several reasons:

1. "Run on every platform you can" is in the DNA of this and just about
every other successful open-source project.  You don't want to drive away
potential users by not supporting their platform.  If they're still
getting good use out of an old OS or non-mainstream architecture, who are
we to tell them not to?

2. Even if a particular platform is no longer a credible target for
production deployments, it can be a useful test case to ensure that we
don't get frozen into a narrow "FooOS on x86_64 is the only case worth
considering" straitjacket.  Software monocultures are bad news; they tend
not to adapt very well when the world changes.  So for instance I'm
reluctant to shut down pademelon, even though its compiler is old enough
to vote, because it's one of not too darn many buildfarm animals whose
compilers are not gcc or derivatives.  We need cases like that to keep us
from building in gcc-isms.  In short, supporting old platforms is one of
the ways that we stay flexible enough to be able to support new platforms
in the future.

3. I see no reason to desupport platforms when we don't gain anything by
it.  In the case of Alpha, it's pretty clear what we gain: we don't have
to worry about its unlike-anything-else memory coherency model.  (I'm not
very worried that future platforms will adopt that idea, either.)  And the
lack of any support from its remaining user community tilts the scales
pretty heavily against it.  I'll be happy to drop testing on HPUX 10.20,
or the ancient OS X versions my other buildfarm critters run, the minute
there is some feature we have a clear need for that one of them doesn't
have.  But I don't think it's desirable to cut anything off as long as
it's still able to run a buildfarm member.  I think those critters are
still capable of catching unexpected portability issues that might affect
more-viable platforms too.


A useful comparison point is the testing Greg Stark did recently for VAX.
Certainly no-one's ever again going to try to get useful work done with
Postgres on a VAX, but that still taught us some good things about
unnecessary IEEE-floating-point dependencies that had snuck into the code.
Someday, that might be important; IEEE 754 won't be the last word on
float arithmetic forever.

As an example of a desupport proposal that I think *is* well-founded,
see my nearby message <27975.1440961181@sss.pgh.pa.us>.
        regards, tom lane



Re: 9.4 broken on alpha

From
Thomas Munro
Date:
On Mon, Aug 31, 2015 at 8:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A useful comparison point is the testing Greg Stark did recently for VAX.
> Certainly no-one's ever again going to try to get useful work done with
> Postgres on a VAX, but that still taught us some good things about
> unnecessary IEEE-floating-point dependencies that had snuck into the code.
> Someday, that might be important; IEEE 754 won't be the last word on
> float arithmetic forever.

Just by the way, there is at least one example of a non-IEEE floating
point format supported by a current production compiler and hardware:
IBM XL C on z/OS (and possibly other platforms) can use either IEEE or
IBM's hex float format, depending on a compiler option.

https://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
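
As a small illustration of how that format differs from IEEE 754, here is a sketch in C that decodes a 32-bit IBM hex float into a double. It assumes the classic single-precision layout (1 sign bit, a 7-bit excess-64 exponent with base 16, and a 24-bit fraction with no hidden bit); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 32-bit IBM hexadecimal float: (-1)^sign * 0.fraction * 16^(exp - 64). */
static double sketch_ibm32_to_double(uint32_t bits)
{
    int      sign = (int) ((bits >> 31) & 1);
    int      exponent = (int) ((bits >> 24) & 0x7F) - 64;  /* excess-64, base 16 */
    uint32_t fraction = bits & 0x00FFFFFFu;                 /* 24 bits, no hidden bit */
    double   magnitude = (double) fraction / 16777216.0;    /* divide by 2^24 */

    assert(exponent >= -64 && exponent <= 63);

    /* Scale by powers of 16; these steps are exact in binary floating point. */
    for (; exponent > 0; exponent--)
        magnitude *= 16.0;
    for (; exponent < 0; exponent++)
        magnitude /= 16.0;

    return sign ? -magnitude : magnitude;
}
```

For example, sketch_ibm32_to_double(0xC276A000u) yields -118.625: the exponent field 0x42 gives 16^2, and the fraction 0x76A000 is 0.46337890625.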


Re: 9.4 broken on alpha

From
Robert Haas
Date:
On Sun, Aug 30, 2015 at 4:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Fetter <david@fetter.org> writes:
>> At a minimum, we should de-support every platform on which literally
>> no new deployments will ever happen.
>> I'm looking specifically at you, HPUX, and I could make a pretty good
>> case for the idea that we can relegate 32-bit platforms to the ash
>> heap of history, at least on the server side.
>
> This wasn't responded to very much, but I wanted to put on record
> that I don't agree with the concept.  There are several reasons:
>
> 1. "Run on every platform you can" is in the DNA of this and just about
> every other successful open-source project.  You don't want to drive away
> potential users by not supporting their platform.  If they're still
> getting good use out of an old OS or non-mainstream architecture, who are
> we to tell them not to?
>
> 2. Even if a particular platform is no longer a credible target for
> production deployments, it can be a useful test case to ensure that we
> don't get frozen into a narrow "FooOS on x86_64 is the only case worth
> considering" straitjacket.  Software monocultures are bad news; they tend
> not to adapt very well when the world changes.  So for instance I'm
> reluctant to shut down pademelon, even though its compiler is old enough
> to vote, because it's one of not too darn many buildfarm animals whose
> compilers are not gcc or derivatives.  We need cases like that to keep us
> from building in gcc-isms.  In short, supporting old platforms is one of
> the ways that we stay flexible enough to be able to support new platforms
> in the future.
>
> 3. I see no reason to desupport platforms when we don't gain anything by
> it.  In the case of Alpha, it's pretty clear what we gain: we don't have
> to worry about its unlike-anything-else memory coherency model.  (I'm not
> very worried that future platforms will adopt that idea, either.)  And the
> lack of any support from its remaining user community tilts the scales
> pretty heavily against it.  I'll be happy to drop testing on HPUX 10.20,
> or the ancient OS X versions my other buildfarm critters run, the minute
> there is some feature we have a clear need for that one of them doesn't
> have.  But I don't think it's desirable to cut anything off as long as
> it's still able to run a buildfarm member.  I think those critters are
> still capable of catching unexpected portability issues that might affect
> more-viable platforms too.

I agree with all this.

I doubt there is a big problem with supporting Alpha apart from
lock-free algorithms.  If critical sections are protected with locks,
I don't believe Alpha requires any special handling.  However,
lock-free algorithms are important.  Given a choice between continuing
to introduce more of them where necessary to improve scalability and
performance, and continuing to support Alpha, I doubt anyone here is
prepared to vote for the latter.  Even if Alpha servers were readily
available both in the buildfarm and for developer testing, I suspect
most people here would be skeptical about requiring future lock-free
algorithms to support Alpha.  But since they aren't available,
imposing that requirement isn't even a realistic option.

The best argument for continuing to support Alpha is probably that
Linux does.  I don't know how they do that.  Presumably most Linux
kernel developers don't have access to Alpha hardware, which makes me
wonder how they avoid missing read_barrier_depends() in places where
it is needed (since it's a no-op everywhere else).  Considering that
Linux's use of lock-free algorithms is vastly more extensive than
ours, it would seem awfully difficult to avoid introducing bugs of
that type.

I previously argued that, rather than changing lwlock.c to use atomics
categorically (and falling back to atomics emulation when real atomics
are not available), we should have two implementations, one based on
atomics and the other relying only on spinlocks.  I believe if we'd
done that, we would be in a position to continue supporting Alpha and
whatever other weird stuff comes up in the future, because, again, I
think lock-based algorithms should be solid everywhere.  Once we
didn't take that path, I think the die was cast.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: 9.4 broken on alpha

From
Andres Freund
Date:
On 2015-09-01 14:40:36 -0400, Robert Haas wrote:
> I doubt there is a big problem with supporting Alpha apart from
> lock-free algorithms.

Note that we've had lock-free algorithms for years. E.g. the changecount
stuff in pgstat.c.

> The best argument for continuing to support Alpha is probably that
> Linux does.

Not sure why that's an argument? I mean linux supports architectures
without an MMU, but we'll surely never?

> I don't know how they do that.  Presumably most Linux
> kernel developers don't have access to Alpha hardware, which makes me
> wonder how they avoid missing read_barrier_depends() in places where
> it is needed (since it's a no-op everywhere else).

I think they do miss it regularly from what I'm skimming on these lists.

> Considering that Linux's use of lock-free algorithms is vastly more
> extensive than ours, it would seem awfully difficult to avoid
> introducing bugs of that type.

In a lot of cases they've embedded the read_barrier_depends() in
macros. E.g. when doing concurrent stuff involving rcu you're only ever
supposed to dereference memory using rcu_dereference(), which on most
architectures is just a volatile cast to force a read from memory, but
which also includes an smp_read_barrier_depends(). There's a bunch of
other similar cases.
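
As a rough illustration of that pattern (not kernel code), the sketch below hides the dependency ordering inside an accessor, so callers cannot forget it. It uses C11 atomics in place of the kernel primitives; `deref_shared`, `publish_config` and `struct config` are hypothetical names, and `memory_order_consume` is typically implemented by compilers as an acquire load.

```c
#include <stdatomic.h>
#include <stddef.h>

struct config
{
    int value;
};

/* Shared pointer: published by one writer, followed by readers. */
static _Atomic(struct config *) current_config = NULL;

/* Writer: fill in the object completely, then publish it with release ordering. */
void publish_config(struct config *c, int value)
{
    c->value = value;
    atomic_store_explicit(&current_config, c, memory_order_release);
}

/*
 * Reader-side accessor, a hypothetical analogue of rcu_dereference().
 * The ordering requirement lives here, so every dereference of the
 * shared pointer gets it without the caller having to remember.
 */
#define deref_shared(ptr) atomic_load_explicit(&(ptr), memory_order_consume)

int read_config_value(void)
{
    struct config *c = deref_shared(current_config);

    return c ? c->value : -1;   /* -1 means nothing has been published yet */
}
```

On architectures that honor data dependencies the accessor costs nothing beyond the load itself; only a platform like Alpha would need a real barrier there.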

> I previously argued that, rather than changing lwlock.c to use atomics
> categorically (and falling back to atomics emulation when real atomics
> are not available), we should have two implementations, one based on
> atomics and the other relying only on spinlocks.

I still think that'd have been a utter horrible mistake. lwlock.c is
already complicated enough. That it actually ends up being faster when
implemented using atomics implementation rather than spinlocks over the
full perdiod doesn't hurt either.

> I believe if we'd done that, we would be in a position to continue
> supporting Alpha and whatever other weird stuff comes up in the
> future, because, again, I think lock-based algorithms should be solid
> everywhere.  Once we didn't take that path, I think the die was cast.

I'm not following how those are related - the relevant pointer chasing
in lwlock.c should actually be safe on alpha (as done under a
spinlock). And whether lwlocks is implemented primarily using spinlocks
or atomics doesn't have a bearing on the data dependency barriers? There
might be a data dependency missing somewhere, but ...?

Since alpha has easy to use atomics support it'd actually have ended up
using the gcc generics and used the atomics implementation anyway.

Greetings,

Andres Freund



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> The best argument for continuing to support Alpha is probably that
> Linux does.  I don't know how they do that.

My sneaking suspicion is that they don't very well.  In particular,
unless I misunderstand things fundamentally, the coherency issues
would be invisible without a multi-CPU machine, and there are probably
not that many multi-CPU Alphas still alive.  The kernel could well be
full of bugs that don't manifest on single-CPU Alphas.

I also note that nominal support is quite different from being production
grade.  Red Hat, for instance, never supported Alpha hardware (at least
not while I was there), and I doubt that any other commercial Linux
support provider has supported it in a long time either.  If there were
bugs, how many people would notice or care?
        regards, tom lane



Re: 9.4 broken on alpha

From
Robert Haas
Date:
On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-09-01 14:40:36 -0400, Robert Haas wrote:
>> I doubt there is a big problem with supporting Alpha apart from
>> lock-free algorithms.
>
> Note that we've had lock-free algorithms for years. E.g. the changecount
> stuff in pgstat.c.

Hmm, true.  I think that stuff is probably missing some barriers that
are technically required even on mainstream platforms.  But the races
are narrow, so you may not see any problem in practice, and if you do,
the worst that'll happen is some junk in pg_stat_activity.
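
For reference, a changecount-style protocol with the barriers spelled out might look like the sketch below. This is not the pgstat.c code; it is a generic single-writer version written with C11 atomics, and `struct status_slot` and the `slot_*` functions are invented names.

```c
#include <stdatomic.h>

/* One writer updates the slot; any number of readers may copy it. */
struct status_slot
{
    atomic_uint changecount;    /* odd while an update is in progress */
    atomic_int  a;
    atomic_int  b;
};

void slot_init(struct status_slot *s)
{
    atomic_init(&s->changecount, 0u);
    atomic_init(&s->a, 0);
    atomic_init(&s->b, 0);
}

void slot_write(struct status_slot *s, int a, int b)
{
    unsigned c = atomic_load_explicit(&s->changecount, memory_order_relaxed);

    /* Mark the slot as "being modified" before touching the data. */
    atomic_store_explicit(&s->changecount, c + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&s->a, a, memory_order_relaxed);
    atomic_store_explicit(&s->b, b, memory_order_relaxed);

    /* Release store: the data writes become visible before the even count. */
    atomic_store_explicit(&s->changecount, c + 2, memory_order_release);
}

void slot_read(struct status_slot *s, int *a, int *b)
{
    unsigned before, after;

    do
    {
        before = atomic_load_explicit(&s->changecount, memory_order_acquire);
        *a = atomic_load_explicit(&s->a, memory_order_relaxed);
        *b = atomic_load_explicit(&s->b, memory_order_relaxed);
        /* Keep the data reads from drifting past the second count read. */
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->changecount, memory_order_relaxed);
    } while (before != after || (before & 1u));   /* retry on a torn read */
}
```

Without the two fences, a reader could see matching counts and still copy a mix of old and new fields, which is the narrow race described above.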

>> The best argument for continuing to support Alpha is probably that
>> Linux does.
>
> Not sure why that's an argument? I mean linux supports architectures
> without an MMU, but we'll surely never?

I'm just saying that, we're arguing that we can't do it, but they're
doing it, so presumably we could find a way if we were really
determined.  I'm not saying that it's a good use of time, but Linux
seems to think it is.

>> I previously argued that, rather than changing lwlock.c to use atomics
>> categorically (and falling back to atomics emulation when real atomics
>> are not available), we should have two implementations, one based on
>> atomics and the other relying only on spinlocks.
>
> I still think that'd have been a utter horrible mistake. lwlock.c is
> already complicated enough. That it actually ends up being faster when
> implemented using atomics implementation rather than spinlocks over the
> full perdiod doesn't hurt either.

I don't know what a "perdiod" is.

>> I believe if we'd done that, we would be in a position to continue
>> supporting Alpha and whatever other weird stuff comes up in the
>> future, because, again, I think lock-based algorithms should be solid
>> everywhere.  Once we didn't take that path, I think the die was cast.
>
> I'm not following how those are related - the relevant pointer chasing
> in lwlock.c should actually be safe on alpha (as done under a
> spinlock). And whether lwlocks is implemented primarily using spinlocks
> or atomics doesn't have a bearing on the data dependency barriers? There
> might be a data dependency missing somewhere, but ...?
>
> Since alpha has easy to use atomics support it'd actually have ended up
> using the gcc generics and used the atomics implementation anyway.

If all of your concurrency control looks like this:

SpinLockAcquire(&mutex);  // barrier
// do stuff
SpinLockRelease(&mutex); // also a barrier

...then I think it doesn't matter what wonky stuff Alpha does.  Mutual
exclusion is mutual exclusion, full stop.  When you start doing things
that use pg_atomic_uint32, or, as you mention, the st_changecount
protocol, you are now potentially relying on memory-ordering semantics
that may vary among platforms.
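
A minimal sketch of the lock-based case, assuming a toy spinlock built on C11 `atomic_flag` (this is not PostgreSQL's s_lock implementation, and all names are made up): the acquire and release operations are the only places where memory ordering appears, and the protected data is accessed with ordinary loads and stores.

```c
#include <stdatomic.h>

typedef struct
{
    atomic_flag locked;
} toy_spinlock;

void toy_lock(toy_spinlock *l)
{
    /* Acquire: nothing in the critical section can move above this. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                       /* spin until the holder releases */
}

void toy_unlock(toy_spinlock *l)
{
    /* Release: nothing in the critical section can move below this. */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

struct counter
{
    toy_spinlock mutex;
    long         value;         /* plain field, only touched under the lock */
};

void counter_init(struct counter *c)
{
    atomic_flag_clear(&c->mutex.locked);
    c->value = 0;
}

long counter_add(struct counter *c, long delta)
{
    long result;

    toy_lock(&c->mutex);
    c->value += delta;          /* ordinary accesses, no explicit barriers */
    result = c->value;
    toy_unlock(&c->mutex);

    return result;
}
```

The correctness argument here depends only on the lock primitives being right for the platform, not on how each caller orders its own reads and writes.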

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: 9.4 broken on alpha

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2015-09-01 14:40:36 -0400, Robert Haas wrote:
>>> The best argument for continuing to support Alpha is probably that
>>> Linux does.

>> Not sure why that's an argument? I mean linux supports architectures
>> without an MMU, but we'll surely never?

> I'm just saying that, we're arguing that we can't do it, but they're
> doing it, so presumably we could find a way if we were really
> determined.  I'm not saying that it's a good use of time, but Linux
> seems to think it is.

I think we've probably beat this to death.  Nobody here believes that
it's sane to try to support Alpha without access to hardware, and no
offer of hardware has been forthcoming.  If one were to materialize,
we could usefully have a debate about whether it's worth doing ...
        regards, tom lane



Re: 9.4 broken on alpha

From
Peter Geoghegan
Date:
On Tue, Sep 1, 2015 at 1:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think we've probably beat this to death.  Nobody here believes that
> it's sane to try to support Alpha without access to hardware, and no
> offer of hardware has been forthcoming.  If one were to materialize,
> we could usefully have a debate about whether it's worth doing ...

I agree. I can't believe how seriously Alpha support has been debated
here. I think that the Linux implementation is simply very limited, or
broken.

Andres mentioned Linux supporting systems without MMUs/paging. I
imagine this was based on this paragraph in the Linux README:

    Linux is easily portable to most general-purpose 32- or 64-bit
    architectures as long as they have a paged memory management unit
    (PMMU) and a port of the GNU C compiler (gcc) (part of The GNU
    Compiler Collection, GCC). Linux has also been ported to a number of
    architectures without a PMMU, although functionality is then
    obviously somewhat limited.

I'm not sure how or to what degree these systems lacking an MMU have
limited support, but I think it's fair to speculate that Alpha may
similarly have severe limitations, or even severe bugs (just like
Postgres 9.4's Alpha support).

-- 
Peter Geoghegan



Re: 9.4 broken on alpha

From
"Joshua D. Drake"
Date:
On 09/01/2015 01:18 PM, Peter Geoghegan wrote:
> On Tue, Sep 1, 2015 at 1:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think we've probably beat this to death.  Nobody here believes that
>> it's sane to try to support Alpha without access to hardware, and no
>> offer of hardware has been forthcoming.  If one were to materialize,
>> we could usefully have a debate about whether it's worth doing ...
>
> I agree. I can't believe how seriously Alpha support has been debated
> here. I think that the Linux implementation is simply very limited, or
> broken.

It isn't even made any more. Alpha is dead except for obscure hobbyists. 
We aren't Debian, we should be much more stringent on the platforms we 
support.

Sincerely,

JD


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.



Re: 9.4 broken on alpha

From
Robert Haas
Date:
On Tue, Sep 1, 2015 at 4:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Sep 1, 2015 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote:
>>> On 2015-09-01 14:40:36 -0400, Robert Haas wrote:
>>>> The best argument for continuing to support Alpha is probably that
>>>> Linux does.
>
>>> Not sure why that's an argument? I mean linux supports architectures
>>> without an MMU, but we'll surely never?
>
>> I'm just saying that, we're arguing that we can't do it, but they're
>> doing it, so presumably we could find a way if we were really
>> determined.  I'm not saying that it's a good use of time, but Linux
>> seems to think it is.
>
> I think we've probably beat this to death.  Nobody here believes that
> it's sane to try to support Alpha without access to hardware, and no
> offer of hardware has been forthcoming.  If one were to materialize,
> we could usefully have a debate about whether it's worth doing ...

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company