Re: PostgreSQL for VAX on NetBSD/OpenBSD - Mailing list pgsql-hackers

From David Brownlee
Subject Re: PostgreSQL for VAX on NetBSD/OpenBSD
Date
Msg-id CAGN_6pZ4oxe1LXcGp00j93h-rdrUk7+RxL9HfZ6SXduQf1ge5Q@mail.gmail.com
Whole thread Raw
In response to Re: PostgreSQL for VAX on NetBSD/OpenBSD  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 20 August 2015 at 14:54, Greg Stark <stark@mit.edu> wrote:
> On Wed, Jun 25, 2014 at 6:05 PM, John Klos <john@ziaspace.com> wrote:
>> While I wouldn't be surprised if you remove the VAX code because not many
>> people are going to be running PostgreSQL, I'd disagree with the assessment
>> that this port is broken. It compiles, it initializes databases, it runs, et
>> cetera, albeit not with the default postgresql.conf.
>
> So I've been playing with this a bit. I have simh running on my home
> server as a Vax  3900 with NetBSD 6.1.5. My home server was mainly
> intended to be a SAN and its cpu is woefully underpowered so the
> resulting VAX is actually very very slow. So slow I wonder if there's
> a bug in the emulator but anyways.

I've run NetBS/vax in simh on a laptop with a 2.5Ghz i5-2520M and it
feels "reasonably fast or a VAX" (make of that what you will :)

> I'm coming to the conclusion that the port doesn't really work
> practically speaking but the failures are more interesting than I
> expected. They come in a few varieties:

Mmm. edge cases and failing (probably reasonable :) assumptions.

> 1) Vax does not have IEEE fp. This manifests in a few ways, some of
> which may be our own bugs or missing expected outputs. The numeric
> data type maths often produce numbers rounded off differently, the
> floating point tests print numbers one digit shorter than our expected
> results expect and sometimes in scientific notation where we don't
> expect. The overflow tests generate floating point exceptions rather
> than overflows. Infinity and NaN don't work. The Json code in
> particular generates large numbers where +/- Infinity literals are
> supplied.
>
> There are some planner tests that fail with floating point exceptions
> -- that's probably a bug on our part. And I've seen at least one
> server crash (maybe two) apparently caused by one as well which I
> don't believe is expected.

Sounds like some useful test cases there.

> 2) The initdb problem is actually not our fault. It looks like a
> NetBSD kernel bug when allocating large shared memory blocks on a
> machine without lots of memory. There's not much initdb can do with a
> kernel panic...

That should definitely be fixed...

>                        BSD still has the problem of kern.maxfiles defaulting
> to a value low enough that even two connections causes the regression
> tests to run out of file descriptors. That's documented and it would
> be a right pain for initdb to detect that case.

Is initdb calling ulimit() to check/set open files? Its probably worth
it as a sanity check if nothing else.

I think the VAX default open_max is 128. The 'bigger' ports have a
default of 1024, and I think they should probably all be updated to
that, though that is orthogonal to a ulimit() check.

> 3) The tests take so long to run that autovacuum kicks in and the
> tests start producing rows in inconsistent orderings. I assume that's
> a problem we've run into on the CLOBBER_CACHE animals as well?

Can the autovaccum daemon settings be bumped/disabled while running the tests?

> 4) One of the tablesample tests seems to freeze indefinitely. I
> haven't looked into why yet. That might indeed indicate that the
> spinlock code isn't working?
>
> So my conclusion tentatively is that while the port doesn't actually
> work practically speaking it is nevertheless uncovering some
> interesting bugs.

Good to hear. Looks like bugs in both the OS and software side, so fun for all.

Thanks for taking the time to do this!

David



pgsql-hackers by date:

Previous
From: "Shulgin, Oleksandr"
Date:
Subject: Re: deparsing utility commands
Next
From: Corey Huinker
Date:
Subject: Re: Declarative partitioning