Thread: [HACKERS] powerpc(32) point/polygon regression failures on Debian Jessie

[HACKERS] powerpc(32) point/polygon regression failures on Debian Jessie

From
Christoph Berg
Date:
The point/polygon regression tests have started to fail on 32-bit
powerpc on Debian Jessie. So far I could reproduce the problem with
PostgreSQL 9.4.10+11 and 9.6.1, on several different machines. Debian
unstable is unaffected.

The failure looks like this:

https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.6&arch=powerpc&ver=9.6.1-2~bpo8%2B1&stamp=1485184696&raw=0

******** build/src/test/regress/regression.diffs ********
*** /«PKGBUILDDIR»/build/../src/test/regress/expected/point.out    Mon Oct 24 20:08:51 2016
--- /«PKGBUILDDIR»/build/src/test/regress/results/point.out    Mon Jan 23 15:17:51 2017
***************
*** 125,131 ****
       | (-3,4)     |                5
       | (-10,0)    |               10
       | (-5,-12)   |               13
!      | (10,10)    |  14.142135623731
       | (5.1,34.5) | 34.8749193547455
  (6 rows)
  
--- 125,131 ----
       | (-3,4)     |                5
       | (-10,0)    |               10
       | (-5,-12)   |               13
!      | (10,10)    | 14.1421356237309
       | (5.1,34.5) | 34.8749193547455
  (6 rows)
  
***************
*** 150,157 ****
             | (-5,-12)   | (-10,0)    |               13
             | (-5,-12)   | (0,0)      |               13
             | (0,0)      | (-5,-12)   |               13
!            | (0,0)      | (10,10)    |  14.142135623731
!            | (10,10)    | (0,0)      |  14.142135623731
             | (-3,4)     | (10,10)    | 14.3178210632764
             | (10,10)    | (-3,4)     | 14.3178210632764
             | (-5,-12)   | (-3,4)     | 16.1245154965971
--- 150,157 ----
             | (-5,-12)   | (-10,0)    |               13
             | (-5,-12)   | (0,0)      |               13
             | (0,0)      | (-5,-12)   |               13
!            | (0,0)      | (10,10)    | 14.1421356237309
!            | (10,10)    | (0,0)      | 14.1421356237309
             | (-3,4)     | (10,10)    | 14.3178210632764
             | (10,10)    | (-3,4)     | 14.3178210632764
             | (-5,-12)   | (-3,4)     | 16.1245154965971
***************
*** 221,227 ****
           | (-10,0)    | (0,0)      |               10
           | (-10,0)    | (-5,-12)   |               13
           | (-5,-12)   | (0,0)      |               13
!          | (0,0)      | (10,10)    |  14.142135623731
           | (-3,4)     | (10,10)    | 14.3178210632764
           | (-5,-12)   | (-3,4)     | 16.1245154965971
           | (-10,0)    | (10,10)    | 22.3606797749979
--- 221,227 ----
           | (-10,0)    | (0,0)      |               10
           | (-10,0)    | (-5,-12)   |               13
           | (-5,-12)   | (0,0)      |               13
!          | (0,0)      | (10,10)    | 14.1421356237309
           | (-3,4)     | (10,10)    | 14.3178210632764
           | (-5,-12)   | (-3,4)     | 16.1245154965971
           | (-10,0)    | (10,10)    | 22.3606797749979

======================================================================

*** /«PKGBUILDDIR»/build/../src/test/regress/expected/polygon.out    Mon Oct 24 20:08:51 2016
--- /«PKGBUILDDIR»/build/src/test/regress/results/polygon.out    Mon Jan 23 15:17:51 2017
***************
*** 222,229 ****
      '(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
      '(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
      '(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
!  on_corner | on_segment | inside |   near_corner   | near_segment 
! -----------+------------+--------+-----------------+--------------
!          0 |          0 |      0 | 1.4142135623731 |          3.2
  (1 row)
  
--- 222,229 ----
      '(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
      '(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
      '(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
!  on_corner | on_segment | inside |   near_corner    | near_segment 
! -----------+------------+--------+------------------+--------------
!          0 |          0 |      0 | 1.41421356237309 |          3.2
  (1 row)
  


The 9.4.11 log contains the same point.out diff, but not polygon.out:

https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=powerpc&ver=9.4.11-0%2Bdeb8u1&stamp=1487517299&raw=0

Does that ring a bell? As Debian unstable is unaffected, the toolchain is
likely to blame, but it worked on Debian Jessie before.

Christoph
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE



Christoph Berg <christoph.berg@credativ.de> writes:
> The point/polygon regression tests have started to fail on 32-bit
> powerpc on Debian Jessie. So far I could reproduce the problem with
> PostgreSQL 9.4.10+11 and 9.6.1, on several different machines. Debian
> unstable is unaffected.

Hmph.  We haven't touched that code in a while, and certainly not in the
9.4.x branch.  I'd have to agree that this must be a toolchain change.

            regards, tom lane



Re: [HACKERS] powerpc(32) point/polygon regression failures on Debian Jessie

From
Christoph Berg
Date:
Re: Tom Lane 2017-02-20 <30737.1487598355@sss.pgh.pa.us>
> Hmph.  We haven't touched that code in a while, and certainly not in the
> 9.4.x branch.  I'd have to agree that this must be a toolchain change.

FYI, in the meantime we could indeed trace it back to a libc issue on
Jessie:

$ cat sqrt.c 
#include <math.h>
#include <stdio.h>
#include <fenv.h>

double
pg_hypot(double x, double y)
{
    double      yx;

    /* Some PG-specific code deleted here */

    /* Else, drop any minus signs */
    x = fabs(x);
    y = fabs(y);

    /* Swap x and y if needed to make x the larger one */
    if (x < y)
    {
        double      temp = x;

        x = y;
        y = temp;
    }

    /*
     * If y is zero, the hypotenuse is x.  This test saves a few cycles in
     * such cases, but more importantly it also protects against
     * divide-by-zero errors, since now x >= y.
     */
    if (y == 0.0)
        return x;

    /* Determine the hypotenuse */
    yx = y / x;
    return x * sqrt(1.0 + (yx * yx));
}


int main ()
{
    //fesetround(FE_TONEAREST);
    printf("fegetround is %d\n", fegetround());
    double r = pg_hypot(10.0, 10.0);
    printf("14 %.14g\n", r);
    printf("15 %.15g\n", r);
    printf("16 %.16g\n", r);
    printf("17 %.17g\n", r);
    return 0;
}


Jessie output:
fegetround is 0
14 14.142135623731
15 14.1421356237309
16 14.14213562373095
17 14.142135623730949

Sid output:
fegetround is 0
14 14.142135623731
15 14.142135623731
16 14.14213562373095
17 14.142135623730951


The Sid output is what the point and polygon tests are expecting.

A possible culprit is this bug report from November:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843904
(Though that doesn't explain why it affects 32-bit powerpc only.)

Christoph



Christoph Berg <christoph.berg@credativ.de> writes:
> Re: Tom Lane 2017-02-20 <30737.1487598355@sss.pgh.pa.us>
>> Hmph.  We haven't touched that code in a while, and certainly not in the
>> 9.4.x branch.  I'd have to agree that this must be a toolchain change.

> FYI, in the meantime we could indeed trace it back to a libc issue on
> Jessie:

I wonder whether it's a compiler change, maybe along the lines of
rearranging the computation so that it gives a slightly different result.
Although you'd think that 10.0/10.0 would give exactly 1.0 no matter what.
Still, it'd be worth comparing the assembly code for your test program.

            regards, tom lane



Re: [HACKERS] powerpc(32) point/polygon regression failures on Debian Jessie

From
Christoph Berg
Date:
Re: Tom Lane 2017-02-20 <13825.1487607143@sss.pgh.pa.us>
> Still, it'd be worth comparing the assembly code for your test program.

I was compiling the program on jessie and on sid, and running the
jessie binary on sid made it output the same as the sid binary, so the
difference isn't in the binary, but in some system library.

Christoph



Re: [HACKERS] powerpc(32) point/polygon regression failures on Debian Jessie

From
Christoph Berg
Date:
Re: To Tom Lane 2017-02-20 <20170220161556.5ukosuj5o572b4rn@msg.credativ.de>
> I was compiling the program on jessie and on sid, and running the
> jessie binary on sid made it output the same as the sid binary, so the
> difference isn't in the binary, but in some system library.

Fwiw, the problem will be fixed in Jessie's glibc by backporting this update:

2015-02-12  Joseph Myers  <joseph@codesourcery.com>

    [BZ #17964]
    * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Use
    __builtin_fma instead of relying on contraction of a * b + c.


https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/?h=jessie&id=b26c084f6eba0057b1cd93e0caf424a1d06bd97e

(Upstream it's probably one of these, didn't dig deeper:
https://sourceware.org/git/?p=glibc.git&a=search&h=HEAD&st=commit&s=__builtin_fma)

Thanks for the input,
Christoph