Thread: hiding variable-length fields from Form_pg_* structs

hiding variable-length fields from Form_pg_* structs

From
Peter Eisentraut
Date:
It would be helpful if variable length catalog fields (except the first
one) would not be visible on the C level in the Form_pg_* structs.  We
keep them listed in the include/catalog/pg_*.h files so that the BKI
generating code can see them and for general documentation, but the
fields are meaningless in C, and some catalog files contain free-form
comments advising the reader of that.  If we could hide them somehow, we
would avoid random programming bugs, deconfuse compilers trying to
generate useful warnings, and save occasional stack space.  There are
several known cases of the first and second issue, at least.

I haven't found the ideal way to implement that yet, but the general
idea would be something like:

CATALOG(pg_attribute,1249) BKI_BOOTSTRAP ...
{   ...   int4        attinhcount;   Oid         attcollation;   aclitem     attacl[1];
CATVARLEN(   text        attoptions[1];   text        attfdwoptions[1];
)
} FormData_pg_attribute;

where CATVARLEN is defined empty in C, and ignored in the BKI generation
code.

The real trick is to find something that handles well with pgindent and
indenting text editors.

Ideas?




Re: hiding variable-length fields from Form_pg_* structs

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> CATALOG(pg_attribute,1249) BKI_BOOTSTRAP ...
> {
>     ...
>     int4        attinhcount;
>     Oid         attcollation;
>     aclitem     attacl[1];
> CATVARLEN(
>     text        attoptions[1];
>     text        attfdwoptions[1];
> )
> } FormData_pg_attribute;

> where CATVARLEN is defined empty in C, and ignored in the BKI generation
> code.

> The real trick is to find something that handles well with pgindent and
> indenting text editors.

The low-tech way would be

CATALOG(pg_attribute,1249) BKI_BOOTSTRAP ...
{   ...   int4        attinhcount;   Oid         attcollation;   aclitem     attacl[1];
#ifdef CATALOG_VARLEN_FIELDS   text        attoptions[1];   text        attfdwoptions[1];
#endif
} FormData_pg_attribute;
        regards, tom lane


Re: hiding variable-length fields from Form_pg_* structs

From
Peter Eisentraut
Date:
On sön, 2011-11-27 at 18:20 -0500, Tom Lane wrote:
> The low-tech way would be
> 
> CATALOG(pg_attribute,1249) BKI_BOOTSTRAP ...
> {
>     ...
>     int4        attinhcount;
>     Oid         attcollation;
>     aclitem     attacl[1];
> #ifdef CATALOG_VARLEN_FIELDS
>     text        attoptions[1];
>     text        attfdwoptions[1];
> #endif
> } FormData_pg_attribute;

Good enough.

To clarify, I believe the rule is that the first variable-length field
can be accessed as a struct field.  Are there any exceptions to this?
This kind of comment is pretty confusing:

CATALOG(pg_rewrite,2618)
{   NameData    rulename;   Oid         ev_class;   int2        ev_attr;   char        ev_type;   char
ev_enabled;  bool        is_instead;
 
   /* NB: remaining fields must be accessed via heap_getattr */   pg_node_tree ev_qual;   pg_node_tree ev_action;
} FormData_pg_rewrite;

Also, this code in relcache.c accesses indclass, which is after an
int2vector and an oidvector field:
   /* Check to see if it is a unique, non-partial btree index on OID */   if (index->indnatts == 1 &&
index->indisunique&& index->indimmediate &&       index->indkey.values[0] == ObjectIdAttributeNumber &&
index->indclass.values[0]== OID_BTREE_OPS_OID &&       heap_attisnull(htup, Anum_pg_index_indpred))       oidIndex =
index->indexrelid;



Re: hiding variable-length fields from Form_pg_* structs

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> To clarify, I believe the rule is that the first variable-length field
> can be accessed as a struct field.  Are there any exceptions to this?

If it is known not null, yes, but I wonder just how many places actually
depend on that.  It might be better to remove all varlena fields from C
visibility and require use of the accessor functions.  We should at
least look into what that would cost us.

> Also, this code in relcache.c accesses indclass, which is after an
> int2vector and an oidvector field:

>     /* Check to see if it is a unique, non-partial btree index on OID */
>     if (index->indnatts == 1 &&
>         index->indisunique && index->indimmediate &&
>         index->indkey.values[0] == ObjectIdAttributeNumber &&
>         index->indclass.values[0] == OID_BTREE_OPS_OID &&
>         heap_attisnull(htup, Anum_pg_index_indpred))
>         oidIndex = index->indexrelid;

Hmm, that does look mighty broken, doesn't it ... but somehow it works,
else GetNewOid would be bleating all the time.  (Thinks about that for
a bit)  Oh, it accidentally fails to fail because the C declarations
for int2vector and oidvector are actually correct if there is a single
element in the arrays, which we already verified with the indnatts test.
But yeah, this seems horribly fragile, and it should not be performance
critical because we only go through here when loading up a cache entry.
So let's change it.
        regards, tom lane


Re: hiding variable-length fields from Form_pg_* structs

From
Robert Haas
Date:
On Mon, Dec 5, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> To clarify, I believe the rule is that the first variable-length field
>> can be accessed as a struct field.  Are there any exceptions to this?
>
> If it is known not null, yes, but I wonder just how many places actually
> depend on that.  It might be better to remove all varlena fields from C
> visibility and require use of the accessor functions.  We should at
> least look into what that would cost us.

My impression is that all the varlena fields also allow nulls.  So I
think there's no point in trying to keep the first one C-accessible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: hiding variable-length fields from Form_pg_* structs

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 5, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Peter Eisentraut <peter_e@gmx.net> writes:
>>> To clarify, I believe the rule is that the first variable-length field
>>> can be accessed as a struct field.  Are there any exceptions to this?

>> If it is known not null, yes, but I wonder just how many places actually
>> depend on that.

> My impression is that all the varlena fields also allow nulls.

See MARKNOTNULL in bootstrap.c.  I think the exceptions were designed to
protect direct accesses to pg_index.
        regards, tom lane


Re: hiding variable-length fields from Form_pg_* structs

From
Robert Haas
Date:
On Mon, Dec 5, 2011 at 3:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Dec 5, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Peter Eisentraut <peter_e@gmx.net> writes:
>>>> To clarify, I believe the rule is that the first variable-length field
>>>> can be accessed as a struct field.  Are there any exceptions to this?
>
>>> If it is known not null, yes, but I wonder just how many places actually
>>> depend on that.
>
>> My impression is that all the varlena fields also allow nulls.
>
> See MARKNOTNULL in bootstrap.c.  I think the exceptions were designed to
> protect direct accesses to pg_index.

Hmm, OK.

rhaas=# select r.relname, a.attname, a.atttypid::regtype from pg_class
r, pg_attribute a where relnamespace=11 and relkind='r' and attrelid =
r.oid and a.attnotnull and a.attlen<0; relname   |   attname    |  atttypid
------------+--------------+------------pg_proc    | proargtypes  | oidvectorpg_index   | indkey       |
int2vectorpg_index  | indcollation | oidvectorpg_index   | indclass     | oidvectorpg_index   | indoption    |
int2vectorpg_trigger| tgattr       | int2vector 
(6 rows)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: hiding variable-length fields from Form_pg_* structs

From
Peter Eisentraut
Date:
So here is a patch for that.

There are a few cases that break when hiding all variable length fields:

Access to indclass in relcache.c, as discussed upthread, which should be
fixed.

Access to pg_largeobject.data.  This is apparently OK, per comment in
inv_api.c.

Access to pg_proc.proargtypes in various places.  This is clearly
useful, so we'll keep it visible.

So I think the relcache.c thing should be fixed and then this might be
good to go.

Attachment

Re: hiding variable-length fields from Form_pg_* structs

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> So I think the relcache.c thing should be fixed and then this might be
> good to go.

Cosmetic gripes: I think we could get rid of the various comments that
say things like "variable length fields start here", since the #ifdef
CATALOG_VARLEN lines now represent that in a standardized fashion.
Possibly those lines should be

#ifdef CATALOG_VARLEN        /* variable-length fields start here */

to be even clearer.

What would be appropriate to add instead of those inconsistently-used
comments is explicit comments about the exception cases such as
proargtypes, to make it clear that the placement of the #ifdef
CATALOG_VARLEN is intentional and not a bug in those cases.
        regards, tom lane


Re: hiding variable-length fields from Form_pg_* structs

From
Peter Eisentraut
Date:
On mån, 2012-01-09 at 14:04 -0500, Tom Lane wrote:
> #ifdef CATALOG_VARLEN           /* variable-length fields start here
> */
>
> to be even clearer.
>
> What would be appropriate to add instead of those inconsistently-used
> comments is explicit comments about the exception cases such as
> proargtypes, to make it clear that the placement of the #ifdef
> CATALOG_VARLEN is intentional and not a bug in those cases.
>
I implemented your suggestions in the attached patch.

Attachment

Re: hiding variable-length fields from Form_pg_* structs

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On mån, 2012-01-09 at 14:04 -0500, Tom Lane wrote:
>> #ifdef CATALOG_VARLEN           /* variable-length fields start here
>> */
>> 
>> to be even clearer.
>> 
>> What would be appropriate to add instead of those inconsistently-used
>> comments is explicit comments about the exception cases such as
>> proargtypes, to make it clear that the placement of the #ifdef
>> CATALOG_VARLEN is intentional and not a bug in those cases.

> I implemented your suggestions in the attached patch.

This looks ready to go to me, except for one trivial nit: in
pg_extension.h, please keep the comment pointing out that extversion
should never be null.
        regards, tom lane