Re: Summary of some postgres portability issues - Mailing list pgsql-hackers

From Ken Camann
Subject Re: Summary of some postgres portability issues
Date
Msg-id 63c05a820807091203j66d4b42fofc1d61bd3f8d3fb9@mail.gmail.com
In response to Re: Summary of some postgres portability issues  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Summary of some postgres portability issues  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On Wed, Jul 9, 2008 at 3:35 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:

> Just clarifying for myself: you are mostly listing theoretical problems
> here, not actual "I ran it and got regression failures" problems, right?

Correct.  This is why most of them point out that they are not
actually real problems, just written in a less than ideal way.  Like
all warnings, they might be a problem, and they might not.  Some of
them almost certainly would cause problems, like any time it is
assumed that a "long" is large enough to hold a memory address.   Many
of them are completely innocent, and it is clear that they are (this
is true with most of them, because they occur in string operations).
Unfortunately they generate the same warning whether they're bad or
not, and it's quite hard to look through the compiler output and make
sense of it.  I can't really make a mental note that I "already
checked that one and it seemed fine" because there are 400 of them in
just the main postgres daemon alone.  But it's hard to try to find the
bad ones with so many of them (most not problems) on the screen at
once.  I'd fix them, but would anyone be willing to commit that?  I
basically have to fix them anyway, just so I don't have to look at
them.  I can't disable the warning, because I need to see it in the
instances where it is important.  I included them in the summary not
because they are all problems which cause failure, but because I felt
they belonged in a summary of all data-model-related portability
issues.  Note that whether a given instance is important is
decidable only by a human.  No one is going to input a user name
bigger than 2^32 characters.  You may not truncate a pointer.  These
are the exact same warning (truncating an integer).
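
To make the distinction concrete, here is a minimal C sketch.  It is
not taken from the postgres source, and the function names are
hypothetical.  Both functions narrow a 64-bit quantity to 32 bits; the
casts are explicit only so the sketch compiles cleanly, whereas the
code being discussed does the same thing implicitly and draws the
warning.

```c
#include <stdint.h>
#include <string.h>

/* Harmless in practice: size_t is narrowed to int, but no realistic
 * user name comes anywhere near INT_MAX bytes. */
int name_length(const char *name)
{
    return (int) strlen(name);
}

/* Potentially fatal: reports whether a 64-bit value (think of it as a
 * pointer's bit pattern) survives being stored in a 32-bit integer.
 * Returns 1 if nothing is lost, 0 if the upper bits were dropped. */
int survives_32bit(uint64_t value)
{
    uint32_t narrow = (uint32_t) value;   /* drops the upper 32 bits */
    return (uint64_t) narrow == value;
}
```

The first conversion can be ignored safely; the second silently
corrupts any address above 4 GB.  The compiler cannot tell them apart,
which is why each one has to be reviewed by hand.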

> You spend some time arguing that long is the wrong type for lengths in
> memory but since all Datums in postgres are limited to 1GB I don't
> understand how this can be a practical problem since that can be stored
> in an int and a long on any platform.

No, I am not trying to argue this (or at least not from a design
standpoint).  Here is something that probably should have been
obvious, but perhaps is not: I do not know very much about postgres,
or how it works.  I also don't know very much about database
theory/algorithms/programming either.  I am also not interested in
adding new functionality to the system.  I'm just a developer who
wants to use a UDF in a 64-bit DLL with a free database, for a totally
different project that I am working on.  MySQL and Firebird are the
only two databases in which you can currently do this.  I like
postgres a lot more than MySQL and Firebird, plus MySQL crashes on my
very large datasets and Firebird has numerous features missing that I
would like.  The easiest thing (although that is becoming less true)
would be to get postgres to compile as a native 64-bit application.  I
honestly thought it would be a lot easier than I think it will be now.
If this continues to be such a problem, I will need to move back to
MySQL and hope they fix the bugs.

I know that a Datum cannot be bigger than 1 GB either way, but the
documentation around the Datum typedef notes that Datum must be large
enough to hold a pointer.  It does not say why, or where this
assumption gets used, or why it was made.  It's simply a warning that
somewhere, in the very large codebase, this may happen.  Where does it
happen?  Does it ever happen?  I really, really wish I knew or that
someone could just tell me.  Stuff like that is basically the reason
that I post to this mailing list.  My assumption is though, that this
code is old and that no one really remembers all the semantics
surrounding all the uses of it.  Are the regression tests so strong
that I can not even worry about it?  That just seems like a very bad
idea, especially since the documentation warns me that "this _will_
happen."

As for what would replace it, I think intptr_t.  This type has the
same size as long on LP32, ILP32, LP64, and ILP64 so there would be no
changes to anything that already works, plus this type can hold a
pointer on LLP64 compilers.  This is a change I have already made.
Then I found several references to the assumption that Datum has long
alignment, from code that produces no warnings.  This would almost
certainly create problems.  It also means I would have to look at
every use of Datum to find stuff like this.  I must also understand it
in full, even when it has little or no documentation.  The story is
similar with long, which other places assume can hold memory
addresses and has pointer alignment.  As I read this
stuff, I become more and more annoyed with Microsoft's decision to
make an LLP64 compiler in the first place.  To not have the larger of
the two basic integer types be able to hold a pointer is just
ridiculous.  However, despite the embrace of gcc and Linux by the
culture, I believe by the numbers MSVC is the most common C/C++
compiler and by a significant margin.  I would get rid of it, if
only I could.  The project that I need the database for is for a
corporation which is not interested in switching.
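
As a rough illustration of the intptr_t idea, the sketch below defines
a stand-in type and round-trips a pointer through it.  SketchDatum and
the helper names are hypothetical and are not the postgres
definitions; the only assumption is that the platform provides
intptr_t in <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical stand-in for a pointer-sized Datum. */
typedef intptr_t SketchDatum;

/* Compile-time check: a negative array size stops the build if the
 * type cannot hold a pointer. */
typedef char sketch_datum_holds_pointer[
    (sizeof(SketchDatum) >= sizeof(void *)) ? 1 : -1];

SketchDatum pointer_to_datum(void *p)
{
    return (SketchDatum) p;
}

void *datum_to_pointer(SketchDatum d)
{
    return (void *) d;
}

/* Returns 1 if a pointer comes back unchanged from the round trip. */
int roundtrip_ok(void)
{
    int value = 42;
    int *back = (int *) datum_to_pointer(pointer_to_datum(&value));
    return back == &value && *back == 42;
}
```

This only covers the size question.  It does nothing about code that
assumes the type has the alignment of long, which is the harder part
described above.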

> Mostly you seem to be noting that whatever compiler you are using is
> much stricter than the other compilers used in the buildfarm. Clearly
> neither icc nor sun studio find these problems on other 64-bit
> platforms.

I am not familiar with these compilers, but I believe neither icc nor
sun studio should have these problems.  They are both LP64, so the
assumption that long can always hold a pointer is correct.  LP64
should still have all the string warnings that do not matter, but they
could just be turned off.  Any potential real problem is unique to
LLP64, a data model never supported by postgres.
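
For readers unfamiliar with the data model names, here is a small
sketch that classifies a platform from the sizes of int, long, and a
pointer.  The function names are hypothetical, and it assumes 8-bit
bytes and considers only the common models.

```c
#include <stddef.h>
#include <string.h>

/* Maps the byte sizes of int, long, and void * to a data model name. */
const char *data_model_name(size_t int_bytes, size_t long_bytes,
                            size_t ptr_bytes)
{
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4)
        return "ILP32";
    if (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8)
        return "LP64";      /* typical 64-bit Unix compilers */
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 8)
        return "LLP64";     /* 64-bit MSVC: long stays 32 bits */
    if (int_bytes == 8 && long_bytes == 8 && ptr_bytes == 8)
        return "ILP64";
    return "other";
}

/* Data model of the platform this code is compiled for. */
const char *this_platform_model(void)
{
    return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}

/* 1 if storing a pointer in a long is safe on this platform. */
int long_can_hold_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}
```

On LP64, long_can_hold_pointer() returns 1, so the assumption holds
and no real problem appears.  On LLP64 it returns 0.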

> I don't understand what you mean here: the Datum type has very clear
> rules about how it is stored. It is essentially opaque, but given the
> typlen you have enough information to know how to copy it for example.

Well, that is some good news.  Where can I find these rules?

