Thread: Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

From
"Pedro J. Lobo"
Date:
On Wed, 18 Mar 1998, Pedro J. Lobo wrote:

>On Tue, 17 Mar 1998, Dwayne Bailey wrote:
>
>>Re: your suggestion to use __alpha and not worry about the
>>makefile, I'm a little uncomfortable with that.  DEC's cc will
>>actually output different symbols, depending on the use of the
>>-std flag.  I'd rather have something that we have explicit
>>control over, rather than relying on the compiler like this.  I'm
>>not violently opposed to using __alpha or anything, it's just a
>>preference against it.
>

[stuff deleted...]

>As you can see, __alpha and __osf__ are always defined. However, I
>understand your point. If we define 'alpha' in the template file, we are
>protected from mind-changing vendors that define __alpha in DU 3.2 and
>__alpha__ in DU 4.0 and alpha__ in DU 5.0 (just an example). From this
>point of view, the current approach is better. And, it's always easier
>(and safer) to leave things untouched.

Just a thought: I think we should make a distinction between architecture
(i.e., define 'alpha') and OS (i.e., define 'osf' or something like that),
now that Linux also runs on Alpha (and NT, if someone ever makes a port).

-------------------------------------------------------------------
Pedro José Lobo Perea                   Tel:    +34 1 336 78 19
Centro de Cálculo                       Fax:    +34 1 331 92 29
EUIT Telecomunicación - UPM             e-mail: pjlobo@euitt.upm.es


Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

From
"Pedro J. Lobo"
Date:
On Wed, 18 Mar 1998, Thomas G. Lockhart wrote:

>> hash_index ..  ok
>> select_views ..  ok
>> alter_table ..  ok
>> portals_p2 ..  ok
>> ==========================================
>>
>> Some of them fail (most notably int2, int4 and float8), but anyway it's
>> better than before :-)
>
>Oooh. I think you might have a running system now! Those int2, int4,

Yes, it seems so.

>float8 "failures" are probably just error message differences and are
>expected.

Yes. For int2: Expected:
! ERROR:  pg_atoi: error reading "100000": Math result not representable

Got:
! ERROR:  pg_atoi: error reading "100000": Result too large

For int4: Expected:
! ERROR:  pg_atoi: error reading "1000000000000": Math result not representable

Got:
! ERROR:  pg_atoi: error reading "1000000000000": Result too large

The same goes for oidint2 and oidint4.

For float8: Expected:
! ERROR:  Bad float8 input format -- overflow

Got:
! ERROR:  floating point exception! The last floating point operation either exceeded legal ranges or was a divide by zero

This one was harmless, but there is another one: Expected:
  QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! bad|            ?column?
! ---+--------------------
!    |                   1
!    |7.39912306090513e-16
!    |                   0
!    |                   0
!    |                   1
! (5 rows)
!

Got:
  QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
! ERROR:  exp() result is out of range

Can someone comment on this?

>The date and time stuff may or may not be a problem, and the
>geometry stuff is probably OK (rounding trouble in the math libraries).

You are right on the geometry stuff. I am not sure about the date stuff.
Some are differences of one second between the expected and the actual
results, some others are dates that appear displaced by 19 years (for
example, expected year 1997 becomes 2016, expected 1957 becomes 1976...).
The diff output is very long on this.

>Make sure your date/time stuff looks OK, at least for simple tests; it
>may be, for example, that your timezone database is just different for
>dates before 1960...

The date/time stuff has never worked completely right. And, if the problem
lies in postgres, that's ok. Sooner or later it will be fixed. But if, as
it seems, the problem lies in the timezone databases, we might be in big
trouble. Perhaps we could make a test, so we can say for sure "your
timezone database is incorrect, go and ask your vendor for a patch".

Also, the test fails on the random stuff:
*** expected/random.out ma 29 abr 07:23:40 1997
--- results/random.out  ma 17 mar 03:51:57 1998
***************
*** 7,18 ****
  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    92
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    98
  (1 row)

--- 7,18 ----
  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    95
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
  count
  -----
!    88
  (1 row)


----------------------


Yes, the results are different, but... aren't they random? O:-)

-------------------------------------------------------------------
Pedro José Lobo Perea                   Tel:    +34 1 336 78 19
Centro de Cálculo                       Fax:    +34 1 331 92 29
EUIT Telecomunicación - UPM             e-mail: pjlobo@euitt.upm.es


Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

From
"Thomas G. Lockhart"
Date:
> This one was harmless, but there is another one: Expected:
>   QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
> ! bad|            ?column?
> ! ---+--------------------
> !    |                   1
> !    |7.39912306090513e-16
> !    |                   0
> !    |                   0
> !    |                   1
> ! (5 rows)
> !
>
> Got:
>   QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
> ! ERROR:  exp() result is out of range
>
> Can someone comment on this?

I think you are getting a better result than the regression test machine
gets. That's good.

> Some are differences of one second between the expected and the actual
> results, some others are dates that appear displaced by 19 years (for
> example, expected year 1997 becomes 2016, expected 1957 becomes
> 1976...). The diff output is very long on this.
> The date/time stuff has never worked completely right. And, if the
> problem lies in postgres, that's ok. Sooner or later it will be fixed.
> But if, as it seems, the problem lies in the timezone databases, we
> might be in big trouble. Perhaps we could make a test, so we can say
> for sure "your timezone database is incorrect, go and ask your vendor
> for a patch".

No, you still have date/time trouble, and it looks as though the
timezone stuff is not being set properly. By definition, it is a problem
with your machine, since the code works on several other platforms, and
no, it isn't likely to get fixed eventually unless you pursue it, since
it does work on the ~20 other OS/processor combinations listed as
supported platforms.

OK, what I meant by "timezone database" trouble would have been sort of
obvious in that only dates from times before computers existed would
have shown problems, and then usually 1 hour differences due to daylight
savings time settings. That is not what you are seeing.

The 19 year differences usually seem to come from mis-handling the
HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
in config.h and see if it helps.

> Yes, the results are different, but... aren't they random? O:-)

Right. OK for random to be different.

                        - Tom

Re: Unix Domain Sockets error (was Re: [HACKERS] Alpha initdb fixed!)

From
"Pedro J. Lobo"
Date:
On Wed, 18 Mar 1998, Thomas G. Lockhart wrote:

>> Got:
>>   QUERY: SELECT '' AS bad, : (f.f1) from FLOAT8_TBL f;
>> ! ERROR:  exp() result is out of range
>>
>> Can someone comment on this?
>
>I think you are getting a better result than the regression test machine
>gets. That's good.

Ok.

>> Some are differences of one second between the expected and the actual
>> results, some others are dates that appear displaced by 19 years (for
>> example, expected year 1997 becomes 2016, expected 1957 becomes
>> 1976...). The diff output is very long on this.
>> The date/time stuff has never worked completely right. And, if the
>> problem lies in postgres, that's ok. Sooner or later it will be fixed.
>> But if, as it seems, the problem lies in the timezone databases, we
>> might be in big trouble. Perhaps we could make a test, so we can say
>> for sure "your timezone database is incorrect, go and ask your vendor
>> for a patch".
>
>No, you still have date/time trouble, and it looks as though the
>timezone stuff is not being set properly. By definition, it is a problem
>with your machine, since the code works on several other platforms, and
>no, it isn't likely to get fixed eventually unless you pursue it, since
>it does work on the ~20 other OS/processor combinations listed as
>supported platforms.

You have misinterpreted me. What I mean is that if the problem lies in
postgres, we can hunt it and fix it, but if the problem lies in the
timezone libraries then it is out of our hands. Of course, the problem
isn't going to vanish into nothingness by itself (although it would be
very nice, wouldn't it? :-)

>OK, what I meant by "timezone database" trouble would have been sort of
>obvious in that only dates from times before computers existed would
>have shown problems, and then usually 1 hour differences due to daylight
>savings time settings. That is not what you are seeing.
>
>The 19 year differences usually seem to come from mis-handling the
>HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
>in config.h and see if it helps.

I am going to be offline for 4 days, until next Monday. I will dig into
that problem then.

-------------------------------------------------------------------
Pedro José Lobo Perea                   Tel:    +34 1 336 78 19
Centro de Cálculo                       Fax:    +34 1 331 92 29
EUIT Telecomunicación - UPM             e-mail: pjlobo@euitt.upm.es


Timezone problems / HAVE_INT_TIMEZONE

From
Mattias Kregert
Date:
Thomas G. Lockhart wrote:

> The 19 year differences usually seem to come from mis-handling the
> HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
> in config.h and see if it helps.
>

Couldn't this be tested for, just like there is a "flex test" which finds
out if flex is ok or not?
Can the configure script find out and add HAVE_INT_TIMEZONE if appropriate?

/* m */



Re: Timezone problems / HAVE_INT_TIMEZONE

From
"Thomas G. Lockhart"
Date:
> Couldn't this be tested for, just like there is a "flex test" which
> finds out if flex is ok or not? Can the configure script find out and
> add HAVE_INT_TIMEZONE if appropriate?

Uh, it does a test already by trying to compile a program referencing a
global integer variable called "timezone". Somehow a few systems will
compile that but don't really have a useful integer timezone
(RH5.0/glibc2.0 is one of those).

I'm wondering if we could change the sense of the test, to try instead
to test for the presence of a timezone field in the tm structure? That
might fix the glibc2.0 port (assuming it still has problems at v2.0.7;
haven't tested recently) but I don't know which other ports might break.

Can we experiment with this Marc?? Post-megapatch of course :)

                      - Tom

Re: Timezone problems / HAVE_INT_TIMEZONE

From
The Hermit Hacker
Date:
On Thu, 19 Mar 1998, Thomas G. Lockhart wrote:

> > Couldn't this be tested for, just like there is a "flex test" which
> > finds out if flex is ok or not? Can the configure script find out and
> > add HAVE_INT_TIMEZONE if appropriate?
>
> Uh, it does a test already by trying to compile a program referencing a
> global integer variable called "timezone". Somehow a few systems will
> compile that but don't really have a useful integer timezone
> (RH5.0/glibc2.0 is one of those).
>
> I'm wondering if we could change the sense of the test, to try instead
> to test for the presence of a timezone field in the tm structure? That
> might fix the glibc2.0 port (assuming it still has problems at v2.0.7;
> haven't tested recently) but I don't know which other ports might break.
>
> Can we experiment with this Marc?? Post-megapatch of course :)

    Sounds reasonable to me...so you want the test changed to:

===========================================================================
#include <stdio.h>
#include <time.h>

main() { struct tm *tmstruct; printf("%s\n", tmstruct->timezone); }
===========================================================================

    And, if the compile fails...how is HAVE_INT_TIMEZONE set?  to
FALSE?

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: Timezone problems / HAVE_INT_TIMEZONE

From
"Thomas G. Lockhart"
Date:
>         Sounds reasonable to me...so you want the test changed to:
> ========================================================================
> #include <stdio.h>
> #include <time.h>
>
> main() { struct tm *tmstruct; printf("%s\n", tmstruct->timezone); }
> ========================================================================

The structure member looks like tm->tm_gmtoff (an integer). There would
need to be other calls to set it up, unless something like

  main() {struct tm tmstruct, *tm = &tmstruct; tm->tm_gmtoff = 0; }

would be acceptable.

> And, if the compile fails...how is HAVE_INT_TIMEZONE set?  to
> FALSE?

Actually, if the test fails, then we need to #undef HAVE_INT_TIMEZONE,
although if it would be easier to set it to FALSE then I can pretty
easily fix up the sources to use that.

                  - Tom

Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

From
Dwayne Bailey
Date:
Thomas G. Lockhart wrote:

> The 19 year differences usually seem to come from mis-handling the
> HAVE_INT_TIMEZONE compile-time option. How is yours set? Try changing it
> in config.h and see if it helps.
>

As far as I've been able to determine, the correct setting for
HAVE_INT_TIMEZONE (1) is being used in the Alpha port.  It does
in fact define 'long timezone' (not 'int timezone') as being
available, as part of the tzset() man page.  I have to admit that
I'm not familiar with the way that this is supposed to work, so
this may seem kind of dumb, but I did some experimenting on the
value of 'timezone' and 'tzname', since the contents of those
variables weren't documented anywhere that I could find in DEC's
man pages.  I of course now know that tzname[0] is the base
timezone name, tzname[1] is the dst name, and timezone is the
number of seconds offset from GMT.

However, what I also discovered is that these values are not set
until after the tzset() routine is called.  Is that normal
behavior?  Doing a grep for tzset in the PG sources revealed
that it's only called for a few SQL commands.  Is it called
anywhere as part of startup processing, and I'm just missing it?
Or is the DEC implementation the only one that requires an
explicit tzset() call before the use of these variables?

--
Dwayne Bailey                   + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com                 + What is your favorite color?
http://www.mika.com/~dwayne     +    Blue ... no, Yelloooooooooooooooooow
            finger dwayne@mika20.mika.com for PGP Public Key



Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

From
Maarten Boekhold
Date:
> However, what I also discovered is that these values are not set
> until after the tzset() routine is called.  Is that normal
> behavior?  Doing a grep for tzset in the PG sources revealed
> that it's only called for a few SQL commands.  Is it called
> anywhere as part of startup processing, and I'm just missing it?
> Or is the DEC implementation the only one that requires an
> explicit tzset() call before the use of these variables?

AFAIK tzset() is called automagically by all time-related libc routines
when they detect it is not set yet (at least I think with Linux it is
done this way. It's been a long time since I looked at that).

Maarten

_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems  |
|                   Department of Electrical Engineering                    |
|           Computer Architecture and Digital Technique section             |
|                          M.Boekhold@et.tudelft.nl                         |
-----------------------------------------------------------------------------

Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

From
Dwayne Bailey
Date:
On Thu, 19 Mar 1998, Maarten Boekhold wrote:

> AFAIK tzset() is called automagically by all time-related libc routines
> when they detect it is not set yet (at least I think with Linux it is
> done this way. It's been a long time since I looked at that).

That would explain it then.  I was just accessing the variables
directly, without any intervening calls.

It's a moot point, anyway.  I put explicit calls in to the
startup, and it made no difference in the result.  It's likely to
be a 32/64 bit issue somewhere that I haven't located yet.  It
really shouldn't be that hard to track down.  Since the output is
different from the input by a consistent amount (19 years +- a
few days) it can only be in one of 4 places, AFAIK: parsing
input, storing value, retrieving value, or generating output.  My
bet is on the retrieve phase, but we'll see.

--
Dwayne Bailey                   + WHAT is your name? Sir Galahad
MIKA Systems, Bingham Farms, MI + WHAT is your quest? I Seek the Holy Grail
dwayne@mika.com                 + What is your favorite color?
http://www.mika.com/~dwayne     +    Blue ... no, Yelloooooooooooooooooow
            finger dwayne@mika20.mika.com for PGP Public Key



Re: [HACKERS] Timezone problems / HAVE_INT_TIMEZONE

From
"Thomas G. Lockhart"
Date:
> It's a moot point, anyway.  I put explicit calls in to the
> startup, and it made no difference in the result.  It's likely to
> be a 32/64 bit issue somewhere that I haven't located yet.  It
> really shouldn't be that hard to track down.  Since the output is
> different from the input by a consistent amount (19 years +- a
> few days) it can only be in one of 4 places, AFAIK: parsing
> input, storing value, retrieving value, or generating output.  My
> bet is on the retrieve phase, but we'll see.

Didn't this stuff work for v6.2.1, even on Alpha? afaik nothing around
this adt code changed recently...

                        - Tom

I moved to another job recently so left my dozen Alphas and don't have
access to man pages on them :( Have you tried compiling with
HAVE_INT_TIMEZONE disabled?