Re: More pgindent follies - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: More pgindent follies
Date
Msg-id 200105220122.f4M1MqB21546@candle.pha.pa.us
Whole thread Raw
In response to More pgindent follies  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: More pgindent follies  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
OK, I have fixes for all of these.  BSD indent clearly has some bugs,
and ironically, the indent source code is quite ugly.  The best way to
work around them is with pre/post processing, which we already do.

I just checked FreeBSD and NetBSD and they don't seem to have added too
much to BSD indent it since it left Berkeley.  They have upped the
keyword limit to 1000, but that is still too small for us, and I don't
see any other major work being done.

Looks like GNU indent development was picked up in 1999 again, so it may
be worth another look:

-r--r--r--   1 34       21        160411 Feb  3  1994 indent-1.9.1.tar.gz
-r--r--r--   1 34       21        143307 May 27  1999 indent-1.10.0.tar.gz
-r--r--r--   1 34       21        183160 Jul  4  1999 indent-2.1.0.tar.gz
-r--r--r--   1 34       21        183999 Jul 15  1999 indent-2.1.1.tar.gz
-r--r--r--   1 34       21        186287 Jul 24  1999 indent-2.2.0.tar.gz
-r--r--r--   1 34       21        191303 Sep 25  1999 indent-2.2.1.tar.gz
-r--r--r--   1 34       21        192872 Sep 28  1999 indent-2.2.2.tar.gz
-r--r--r--   1 34       21        198319 Oct  1  1999 indent-2.2.3.tar.gz
-r--r--r--   1 34       21        197332 Nov  4  1999 indent-2.2.4.tar.gz
-r--r--r--   1 34       21        203764 Jan 16  2000 indent-2.2.5.tar.gz
-r--r--r--   1 34       21        222510 Nov 17  2000 indent-2.2.6.tar.gz

OK, here are my comments.


> As long as you're fixing bugs in pgindent, here are some more follies
> exhibited by the most recent pgindent run.  Some of these bugs have
> been there for awhile, but at least one (the removal of space before
> a same-line comment) seems to be new as of the most recent run.
> 

Glad to fix these.  I hacked together pgindent just to match what I
thought looked good, so if people have things that look bad, let me
know.


> The examples are all taken from 
> 
> $ cd .../src/backend/utils/adt/
> $ cvs diff -c -r1.85 -r1.86 selfuncs.c

Great that you found them all in one file, or bad that they all existed
in one file.  :-)


> but there are many similar cases elsewhere.
> 
> 1. I can sort of understand inserting space before an #ifdef (though I do
> not agree with it), but why before #endif?  Why does the first example
> insert a blank line *only* before the #endif and not its matching #if?
> If you're trying to set off the #if block from surrounding code, seems
> like a blank line *after* the #endif would help, not one before.  But
> none of the here-inserted blank lines seem to me to improve readability;
> I'd vote for not making any such changes.
> 
> ***************
> *** 648,653 ****
> --- 653,659 ----
>   {
>   #ifdef NOT_USED                    /* see neqjoinsel() before removing me! */
>       Oid            opid = PG_GETARG_OID(0);
> + 
>   #endif
>       Oid            relid1 = PG_GETARG_OID(1);
>       AttrNumber    attno1 = PG_GETARG_INT16(2);
> ***************
> *** 1078,1087 ****
> --- 1091,1102 ----
>   convert_string_datum(Datum value, Oid typid)
>   {
>       char       *val;
> + 
>   #ifdef USE_LOCALE
>       char       *xfrmstr;
>       size_t        xfrmsize;
>       size_t        xfrmlen;
> + 
>   #endif
>   
>       switch (typid)
> ***************

Fixed with 'awk' script to remove blank line above #endif.

> 
> 2. Here are two examples of a long-standing bug: pgindent frequently
> (but not always) mis-indents the first 'case' line(s) of a switch.
> I don't agree with its indentation of block comments between case
> entries, either.
> 
> ***************
> *** 845,892 ****
>       switch (valuetypid)
>       {
>   
> !         /*
> !          * Built-in numeric types
> !          */
> !         case BOOLOID:
> !         case INT2OID:
> !         case INT4OID:
> !         case INT8OID:
> !         case FLOAT4OID:
> !         case FLOAT8OID:
> !         case NUMERICOID:
> !         case OIDOID:
> !         case REGPROCOID:
>               *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
>               *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid);
>               *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid);
>               return true;
>   
> !         /*
> !          * Built-in string types
> !          */
>           case CHAROID:
>           case BPCHAROID:
>           case VARCHAROID:
> --- 851,898 ----
>       switch (valuetypid)
>       {
>   
> !             /*
> !              * Built-in numeric types
> !              */
> !             case BOOLOID:
> !             case INT2OID:
> !             case INT4OID:
> !             case INT8OID:
> !             case FLOAT4OID:
> !             case FLOAT8OID:
> !             case NUMERICOID:
> !             case OIDOID:
> !             case REGPROCOID:
>               *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
>               *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid);
>               *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid);
>               return true;
>   
> !             /*
> !              * Built-in string types
> !              */
>           case CHAROID:
>           case BPCHAROID:
>           case VARCHAROID:
> ***************
> *** 911,917 ****
>   {
>       switch (typid)
>       {
> !         case BOOLOID:
>               return (double) DatumGetBool(value);
>           case INT2OID:
>               return (double) DatumGetInt16(value);
> --- 917,923 ----
>   {
>       switch (typid)
>       {
> !             case BOOLOID:
>               return (double) DatumGetBool(value);
>           case INT2OID:
>               return (double) DatumGetInt16(value);
> ***************

This is actually caused by functions that do not define local variables
but start with switch:

intfunc(int x){    switch (x)    {        case 1:        case 2:

in these cases, case 1 and case2 get indented too far.  They actually
get indented as though they were variable names, which is clearly wrong.
BSD indent has a bug in such cases, so the trick was to add:
int pgindent_func_no_var_fix;

before indent is run, and after remove it and an optional blank line if
it was the only variable definition.

> 
> 3. This is new misbehavior as of the last pgindent run: whitespace
> between a statement and a comment on the same line is sometimes removed.
> 
> ***************
> *** 1120,1126 ****
>   
>   #ifdef USE_LOCALE
>       /* Guess that transformed string is not much bigger than original */
> !     xfrmsize = strlen(val) + 32;        /* arbitrary pad value here... */
>       xfrmstr = (char *) palloc(xfrmsize);
>       xfrmlen = strxfrm(xfrmstr, val, xfrmsize);
>       if (xfrmlen >= xfrmsize)
> --- 1137,1143 ----
>   
>   #ifdef USE_LOCALE
>       /* Guess that transformed string is not much bigger than original */
> !     xfrmsize = strlen(val) + 32;/* arbitrary pad value here... */
>       xfrmstr = (char *) palloc(xfrmsize);
>       xfrmlen = strxfrm(xfrmstr, val, xfrmsize);
>       if (xfrmlen >= xfrmsize)
> ***************


This is happening because it has landed on a tab stop and isn't adding
another one.  I have added a 'sed' line to properly push these to the
next tab stop.

> 
> 4. This breaking of a comment attached to a #define scares me.
> Apparently gcc still thinks the comment is valid, but it seems to me
> that this is making not-very-portable assumptions about the behavior of
> the preprocessor.  I always thought that you needed to use backslashes
> to continue a preprocessor command onto the next line reliably.
> 
> ***************
> *** 1691,1705 ****
>   
>   #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>   #define CHAR_RANGE_SEL    0.25
> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match end-of-string */
>   #define FULL_WILDCARD_SEL 5.0
>   #define PARTIAL_WILDCARD_SEL 2.0
>   
> --- 1718,1733 ----
>   
>   #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>   #define CHAR_RANGE_SEL    0.25
> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match
> !                                  * end-of-string */
>   #define FULL_WILDCARD_SEL 5.0
>   #define PARTIAL_WILDCARD_SEL 2.0
>   
> ***************

I don't see the problem here.  My assumption is that the comment is not
part of the define, right?


> 
> 5. Here's an interesting one: why was the "return" line misindented?
> I don't particularly agree with the handling of the comments on the
> #else and #endif lines either... they're not even mutually consistent,
> let alone stylistically good.
> 
> ***************
> *** 1904,1912 ****
>       else
>           result = false;
>       return (bool) result;
> ! #else /* not USE_LOCALE */
> !     return true;                /* We must be in C locale, which is OK */
> ! #endif /* USE_LOCALE */
>   }
>   
>   /*
> --- 1935,1943 ----
>       else
>           result = false;
>       return (bool) result;
> ! #else                            /* not USE_LOCALE */
> !                 return true;    /* We must be in C locale, which is OK */
> ! #endif     /* USE_LOCALE */
>   }
>   
>   /*
> ***************

Same cause as #2 (switch/case indent).  Fixed.

I will cvs commit the pgindent changes, and send a diff of selfuncs.c to
patches so people can see the fixes.   Fixing the entire tree will have
to wait for 7.2 beta.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-hackers by date:

Previous
From: Barry Lind
Date:
Subject: Re: Plans for solving the VACUUM problem
Next
From: Bruce Momjian
Date:
Subject: Re: New system catalog idea