NetBSD "Bad address" failure (was Re: Third call for platform testing) - Mailing list pgsql-hackers

From Tom Lane
Subject NetBSD "Bad address" failure (was Re: Third call for platform testing)
Date
Msg-id 9179.987210991@sss.pgh.pa.us
In response to Re: [lockhart@alumni.caltech.edu: Third call for platform testing]  (Tom Ivar Helbekkmo <tih@kpnQwest.no>)
Responses Re: NetBSD "Bad address" failure (was Re: Third call for platform testing)  (Tom Lane <tgl@sss.pgh.pa.us>)
re: NetBSD "Bad address" failure (was Re: Third call for platform testing)  (matthew green <mrg@eterna.com.au>)
List pgsql-hackers
Tom Ivar Helbekkmo <tih@kpnQwest.no> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
>>> + ERROR:  cannot read block 3 of hash_i4_index: Bad address
>> 
>> "Bad address"?  That seems pretty bizarre.

> This is obviously something that shows up on _some_ NetBSD platforms.
> The above was on sparc64, but that same problem is the only one I see
> in the regression testing on NetBSD/vax that isn't just different
> floating point (the VAX doesn't have IEEE), different ordering of
> (unordered) collections or different wording of strerror() output.

> NetBSD/i386 doesn't have the "Bad address" problem.

After looking into it, I find that the problem is this: Postgres, or at
least the hash-index part of it, expects to be able to lseek() to a
position past the end of a file and then get a non-failure return from
read().  (This happens indirectly because it uses ReadBuffer for blocks
that it has never yet written.)  Given the attached test program, I get
this result on my own machine:

$ touch z            -- create an empty file
$ ./a.out z 0            -- read at offset 0
Read 0 bytes
$ ./a.out z 1            -- read at offset 8K
Read 0 bytes
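
The same check can be folded into a single function for anyone who wants to probe a platform without the shell steps. This is only a minimal sketch: the name read_past_eof, the BLOCK_SIZE macro, and the -2 setup-failure code are assumptions made for illustration and are not part of the attached program.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed: the 8K block size used above */

/*
 * Hypothetical helper: create an empty file at `path` (truncating any
 * existing file, then unlinking it), seek to block `blockno`, and try to
 * read one block.
 *
 * Returns the read() result: 0 where the kernel reports EOF, -1 (with
 * errno preserved) where read() fails.  Returns -2 if open() or lseek()
 * fails before read() is reached.
 */
long
read_past_eof(const char *path, int blockno)
{
    char        buf[BLOCK_SIZE];
    long        result;
    int         saved_errno;
    int         fd;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -2;
    unlink(path);               /* the file goes away once fd is closed */

    if (lseek(fd, (off_t) blockno * BLOCK_SIZE, SEEK_SET) < 0)
    {
        close(fd);
        return -2;
    }

    result = (long) read(fd, buf, sizeof(buf));
    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return result;
}
```

Calling read_past_eof("z", 1) corresponds to the "./a.out z 1" step: a return of 0 means read() reported EOF, while -1 means read() itself failed and errno says why. Note that, unlike the shell steps, this sketch truncates and removes the named file.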

Presumably, the same result appears everywhere else that the regress
tests pass.  But NetBSD 1.5T gives

$ touch z
$ ./a.out z 0
Read 0 bytes
$ ./a.out z 1
read: Bad address
$ uname -a
NetBSD varg.i.eunet.no 1.5T NetBSD 1.5T (VARG) #4: Thu Apr  5 23:38:04 CEST 2001     root@varg.i.eunet.no:/usr/src/sys/arch/vax/compile/VARG vax

I think this is indisputably a bug in (some versions of) NetBSD.  If I
can seek past the end of file, read() shouldn't consider it a hard error
to read there --- and in any case, EFAULT isn't a very reasonable error
code to return.  Since it seems not to be a widespread problem, I'm not
eager to change the hash code to try to avoid it.
        regards, tom lane


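#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Usage: a.out FILE BLOCKNO -- seek to 8K block BLOCKNO of FILE and read it */
int main (int argc, char** argv)
{
    char *fname;
    int fd, readres;
    long seekres;
    char buf[8192];

    if (argc < 3)
    {
        fprintf(stderr, "usage: %s file blockno\n", argv[0]);
        exit(1);
    }
    fname = argv[1];

    fd = open(fname, O_RDONLY, 0);
    if (fd < 0)
    {
        perror(fname);
        exit(1);
    }

    /* Seeking past the end of the file is allowed and should succeed */
    seekres = lseek(fd, (off_t) atoi(argv[2]) * 8192, SEEK_SET);
    if (seekres < 0)
    {
        perror("seek");
        exit(1);
    }

    /* Reading at or past EOF should return 0 bytes, not an error */
    readres = (int) read(fd, buf, sizeof(buf));
    if (readres < 0)
    {
        perror("read");
        exit(1);
    }
    printf("Read %d bytes\n", readres);

    exit(0);
}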

