Re: A Windows x64 port of PostgreSQL - Mailing list pgsql-hackers

From Mark Mielke
Subject Re: A Windows x64 port of PostgreSQL
Date
Msg-id 486C7D95.5010306@mark.mielke.cc
In response to Re: A Windows x64 port of PostgreSQL  ("Ken Camann" <kjcamann@gmail.com>)
List pgsql-hackers
A bit long - the summary is that "intptr_t" should probably be used, 
assuming I understand the problem this thread is talking about:

Ken Camann wrote:
> 1. An object in memory can have size "Size" (= size_t).  So its big
> (maybe 8 bytes).
> 2. An index into the buffer containing that object has index "Index"
> (= int)  So its smaller (maybe 4 bytes).  Now you can't index your big
> object, unless sizeof(size_t) = sizeof(int).  But sizeof(size_t) must
> be at least 8 bytes on just about any 64-bit system.  And sizeof(int)
> is still 4 most of the time, right

I believe one of the mistakes here is the assumption that "int" is
always the correct type to use for an index. It is not. "int" is
typically the most efficient word size for the target machine, and since
"int" is usually 32 bits these days, its range is sufficient for most
common operations, which is why it is so commonly used. But the C and
C++ specifications do not say that an index into an array has type
"int". Rather, they define E1[E2] as *((E1) + (E2)), and the + operator
is defined such that if one operand E1 is a pointer and the other
operand E2 has integer type, the result is a pointer to the element E2
positions past the one E1 points to, with the same pointer type as E1.
"integer type" is not "int". It is any integer type. If the useful
range of the array is 256 values, an "unsigned char" is acceptable as an
index, since "unsigned char" is an integer type (a plain "char" may be
signed and stop at 127). The compiler might widen the index to a 32-bit
or 64-bit machine register before calculating the result of the
addition, but this is irrelevant to the definition of the C language.
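
To make the point about index types concrete, here is a minimal sketch
in C. The function name same_element is made up for illustration. It
reads one array element through an unsigned char, a size_t, and a
long long index, and checks that subscripting agrees with the
pointer-plus-integer form:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if table[i] reads the same element
 * regardless of which integer type carries the index. */
int same_element(const int *table, unsigned char i)
{
    size_t big = i;        /* wider unsigned index */
    long long wide = i;    /* wider signed index */

    /* E1[E2] is defined as *((E1) + (E2)) */
    assert(&table[i] == table + i);

    return table[i] == table[big] && table[big] == table[wide];
}
```

The caller is assumed to pass an array with at least i + 1 elements.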

I think one could successfully argue that ptrdiff_t is the correct type
to use for an array index that might need a range larger than "int" on a
machine where sizeof(int) < sizeof(void*). ptrdiff_t represents the
difference between two pointers. If P and Q are char* pointing into the
same object (arithmetic on void* is not defined by ISO C) and I is
ptrdiff_t, and Q - P = I, then &P[I] = Q. Though, I think it might be
easier to use size_t. If I is of type size_t, and P = malloc(I)
succeeds, then P[0] ... P[I-1] are guaranteed to be addressable using a
size_t index.
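
A small sketch of both ideas, assuming a hosted C99 compiler. The
function name fill_and_measure is hypothetical. It walks a malloc'd
buffer with a size_t index, then takes the difference of two pointers
into it as a ptrdiff_t. It uses unsigned char* rather than void*,
because ISO C only defines pointer arithmetic on pointers to object
types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n bytes, fills them using a size_t
 * index, and returns the distance from the first to the last byte as a
 * ptrdiff_t. Returns -1 if n is 0 or the allocation fails. */
ptrdiff_t fill_and_measure(size_t n)
{
    if (n == 0)
        return -1;

    unsigned char *p = malloc(n);
    if (p == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)      /* p[0] .. p[n-1] */
        p[i] = (unsigned char)(i & 0xFF);

    unsigned char *q = &p[n - 1];
    ptrdiff_t d = q - p;                /* difference of two pointers */
    assert(&p[d] == q);                 /* Q - P = I implies &P[I] = Q */

    free(p);
    return d;
}
```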

There is also the usable range, even on a machine with a 64-bit size_t.
I don't think any existing machine can actually address 64 bits worth of
contiguous memory. 48 bits perhaps. Technically, sizeof(size_t) does not
need to equal sizeof(void*), and in fact, the C standard has this to
say: "The types used for size_t and ptrdiff_t should not have an
integer conversion rank greater than that of signed long int unless the
implementation supports objects large enough to make this necessary." It
doesn't define sizeof(size_t) in terms of sizeof(void*).

The C standard specifies the minimum limits of long int as:
"Their implementation-defined values shall be equal or greater in 
magnitude (absolute value) to those shown, with the same sign.
...
— minimum value for an object of type long int
LONG_MIN -2147483647 // −(2**31 − 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2**31 − 1"

Based upon this definition, it appears that Windows 64, where long int
stays at 32 bits, is compatible with the standard. That GCC on most
64-bit Unix-like platforms took a different route (a 64-bit long int)
that is also compatible with the standard is inconvenient, but a reality
that should be dealt with.
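
A minimal sketch of how code could detect which route a platform took,
rather than assume one. The function name long_is_pointer_sized is
hypothetical. I would expect it to return 1 on the usual 64-bit
Unix-like targets and 0 on 64-bit Windows, but that is an expectation
about those platforms, not something the standard promises:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if long is at least as wide as an
 * object pointer on this platform, 0 otherwise. */
int long_is_pointer_sized(void)
{
    /* The only range the standard guarantees for long int. */
    assert(LONG_MAX >= 2147483647L);

    return sizeof(long) >= sizeof(void *);
}
```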

More comments from the C standard on this issue: "Any pointer type may 
be converted to an integer type. Except as previously specified, the 
result is implementation-defined. If the result cannot be represented in 
the integer type, the behavior is undefined. The result need not be in 
the range of values of any integer type."

The "portable" answer to this problem is supposed to be intptr_t:
"7.18.1.4 Integer types capable of holding object pointers
The following type designates a signed integer type with the property
that any valid pointer to void can be converted to this type, then
converted back to pointer to void, and the result will compare equal to
the original pointer:
intptr_t
The following type designates an unsigned integer type with the property
that any valid pointer to void can be converted to this type, then
converted back to pointer to void, and the result will compare equal to
the original pointer:
uintptr_t
These types are optional."
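
A minimal sketch of that round trip, assuming a C99 <stdint.h>. The
function name through_intptr is made up. Because the type is optional,
the sketch checks for INTPTR_MAX first, which <stdint.h> defines only
when intptr_t itself is provided:

```c
#include <assert.h>
#include <stdint.h>

#ifndef INTPTR_MAX
#error "intptr_t is not provided by this implementation"
#endif

/* Hypothetical helper: converts a pointer to intptr_t and back again,
 * and returns the pointer that comes back. */
void *through_intptr(void *p)
{
    intptr_t n = (intptr_t)p;
    void *back = (void *)n;

    /* 7.18.1.4: the result compares equal to the original pointer. */
    assert(back == p);
    return back;
}
```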

If Windows 64 has this type (not sure - I don't use Windows 64), then I
believe intptr_t is the portable way to solve this problem. Note,
though, that intptr_t is not guaranteed to hold every integer value. For
example, on a 32-bit platform, intptr_t might be 32 bits wide while
long long is 64 bits wide. There is also this portable type:
"7.18.1.5 Greatest-width integer types
The following type designates a signed integer type capable of
representing any value of any signed integer type:
intmax_t
The following type designates an unsigned integer type capable of
representing any value of any unsigned integer type:
uintmax_t
These types are required."

I think this means that if PostgreSQL were to be designed to support all 
ISO C compliant platforms, PostgreSQL would have to use a union of 
intptr_t and intmax_t. Or, PostgreSQL will choose to not support some 
platforms. Windows 64 seems as if it may continue to be as popular as 
Windows 32, and should probably be supported.
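
To illustrate what I mean by a union, here is a rough sketch. The type
wide_slot and the four helper functions are hypothetical names of my
own, not anything that exists in PostgreSQL. The idea is only that one
slot is wide enough for either a pointer or the largest signed integer:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a value slot wide enough for either an object
 * pointer (stored as intptr_t) or any signed integer (intmax_t). */
typedef union {
    intptr_t ptr;
    intmax_t num;
} wide_slot;

wide_slot slot_from_pointer(void *p)
{
    wide_slot s;
    s.num = 0;                      /* clear the wider member first */
    s.ptr = (intptr_t)p;
    assert((void *)s.ptr == p);     /* round trip must be lossless */
    return s;
}

wide_slot slot_from_integer(intmax_t v)
{
    wide_slot s;
    s.num = v;
    return s;
}

void *slot_to_pointer(wide_slot s)
{
    return (void *)s.ptr;
}

intmax_t slot_to_integer(wide_slot s)
{
    return s.num;
}
```

The caller has to remember which member was stored. The union itself
carries no tag.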

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>


