Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions. - Mailing list pgsql-committers

From Heikki Linnakangas
Subject Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.
Date
Msg-id 532EE025.30109@vmware.com
Whole thread Raw
In response to Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.  (Andrew Dunstan <andrew@dunslane.net>)
Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.  (Greg Stark <stark@mit.edu>)
List pgsql-committers
On 03/22/2014 03:33 PM, Andrew Dunstan wrote:
>
> On 03/21/2014 06:43 PM, Heikki Linnakangas wrote:
>> On 03/21/2014 10:47 PM, Andres Freund wrote:
>>> On 2014-03-21 17:37:35 -0400, Tom Lane wrote:
>>>> Andres Freund <andres@2ndquadrant.com> writes:
>>>>> I think the GinLogicValueEnum is supposed to be an enum's name, not a
>>>>> variable name, right?
>>>>
>>>> I think the whole thing is too cute by half.  Why isn't it just
>>>>
>>>> typedef enum GinLogicValue
>>>> {
>>>>       GIN_FALSE = 0,           /* item is present / matches */
>>>>       GIN_TRUE = 1,            /* item is not present / does not
>>>> match */
>>>>       GIN_MAYBE = 2            /* don't know if item is present /
>>>> don't know if
>>>>                                 * matches */
>>>> } GinLogicValue;
>>>>
>>>> instead of thinking that we are smarter than the compiler about how
>>>> to store the enum?
>>>
>>> It seems to be a memory only type, so using anything but the raw enum
>>> type seems odd. If it were ondisk alignment stuff could make it
>>> advantageous, but this way...
>>
>> That enum is used in the "check" arrays that are passed around GIN,
>> with one element per index term being searched. I'd really like to
>> keep it small, because the can be hundreds of elements long, and if
>> the compiler decides to store it as a 4- or 8-byte value instead of
>> one byte, it starts to add up.
>>
>> Besides memory usage, it's convenient that an array of GinLogicValues
>> is compatible with an array of booleans, as long as there are no
>> GIN_MAYBE values in it.
>>
>> I committed your original fix to make it an enum type, like it was
>> supposed to be. Thanks!
>
> I don't think there's any guarantee it's only going to be one byte. The
> ANSI C standard says:
>
>      Each enumerated type shall be compatible with char, a signed integer
>      type, or an unsigned integer type. The choice of type is
>      implementation-defined. (6.7.2.2 Enumerationspecifiers)

Here's the code from gin.h (after fixing the issues Andres and Tom
pointed out):

> enum GinLogicValueEnum
> {
>         GIN_FALSE = 0,                  /* item is not present / does not match */
>         GIN_TRUE = 1,                   /* item is present / matches */
>         GIN_MAYBE = 2                   /* don't know if item is present / don't know if
>                                                          * matches */
> };
>
> typedef char GinLogicValue;

The reason for the typedef is precisely that an enum is not guaranteed
to be one byte. Tom suggested getting rid of the typedef, but it's
needed to make sure it's stored as one byte.

I'll go add a comment to it, explaining why it's needed.

- Heikki


pgsql-committers by date:

Previous
From: Noah Misch
Date:
Subject: pgsql: Don't test xmin/xmax columns of a postgres_fdw foreign table.
Next
From: Andrew Dunstan
Date:
Subject: Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.