SQL/JSON functions vs. ECPG vs. STRING as a reserved word - Mailing list pgsql-hackers

From Tom Lane
Subject SQL/JSON functions vs. ECPG vs. STRING as a reserved word
Date
Msg-id 3661437.1653855582@sss.pgh.pa.us
Whole thread Raw
Responses Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word  (Michael Meskes <meskes@postgresql.org>)
Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Commit 1a36bc9db (SQL/JSON query functions) introduced STRING as a
type_func_name_keyword.  As per the complaint at [1], this broke use
of "string" as a table name, column name, function parameter name, etc.
The restriction about column names, in particular, seems likely to
break boatloads of applications --- wouldn't you think that "string"
is a pretty likely column name?  We have no cover to claim "the SQL
standard says so", either, because they list STRING as an unreserved
keyword.

This is trivial enough to fix so far as the core grammar is concerned;
it works to just change STRING to be an unreserved_keyword.  However,
various ECPG tests fall over, so I surmise that somebody felt it was
okay to break potentially thousands of applications in order to avoid
touching ECPG.  I do not think that's an acceptable trade-off.

I poked into it a bit and found the proximate cause: ECPG uses
ECPGColLabelCommon to represent user-chosen type names in some
places, and that production *does not include unreserved_keyword*.
(Sure enough, the failing test cases try to use "string" as a
type name in a variable declaration.)  That's a pre-existing
bit of awfulness, and it's indeed pretty awful, because it means
that any time we add a new keyword --- even a fully unreserved one
--- we risk breaking somebody's ECPG application.  We just hadn't
happened to add any keywords to date that conflicted with type names
used in the ECPG test suite.

I looked briefly at whether we could improve that situation.
Two of the four uses of ECPGColLabelCommon in ecpg.trailer can be
converted to the more general ECPGColLabel without difficulty,
but trying to change either of the uses in var_type results in
literally thousands of shift/reduce and reduce/reduce conflicts.
This seems to be because what follows ecpgstart can be either a general
SQL statement or an ECPGVarDeclaration (beginning with var_type),
and bison isn't smart enough to disambiguate.  I have a feeling that
this situation could be improved with enough elbow grease, because
plpgsql manages to solve a closely-related problem in allowing its
assignment statements to coexist with general SQL statements.  But
I don't have the interest to tackle it personally, and anyway any
fix would probably be more invasive than we want to put in post-beta.

I also wondered briefly about just changing the affected test cases,
reasoning that breaking some ECPG applications that use "string" is
less bad than breaking everybody else that uses "string".  But type
"string" is apparently a standard ECPG and/or INFORMIX thing, so we
probably can't get away with that.

Hence, I propose the attached, which fixes the easily-fixable
ECPGColLabelCommon uses and adds a hard-wired special case for
STRING to handle the var_type usage.

More generally, I feel like we have a process problem: there needs to
be a higher bar to adding new fully- or even partially-reserved words.
I might've missed it, but I don't recall that there was any discussion
of the compatibility implications of this change.

            regards, tom lane

[1]
https://www.postgresql.org/message-id/PAXPR02MB760049C92DFC2D8B5E8B8F5AE3DA9%40PAXPR02MB7600.eurprd02.prod.outlook.com

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 989db0dbec..969c9c158f 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -17940,6 +17940,7 @@ unreserved_keyword:
             | STORAGE
             | STORED
             | STRICT_P
+            | STRING
             | STRIP_P
             | SUBSCRIPTION
             | SUPPORT
@@ -18098,7 +18099,6 @@ type_func_name_keyword:
             | OVERLAPS
             | RIGHT
             | SIMILAR
-            | STRING
             | TABLESAMPLE
             | VERBOSE
         ;
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 8a2ab405a2..ae35f03251 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -426,7 +426,7 @@ PG_KEYWORD("stdout", STDOUT, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("storage", STORAGE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("stored", STORED, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("strict", STRICT_P, UNRESERVED_KEYWORD, BARE_LABEL)
-PG_KEYWORD("string", STRING, TYPE_FUNC_NAME_KEYWORD, BARE_LABEL)
+PG_KEYWORD("string", STRING, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("strip", STRIP_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("subscription", SUBSCRIPTION, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("substring", SUBSTRING, COL_NAME_KEYWORD, BARE_LABEL)
diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer
index daf979a8e8..b95fc44314 100644
--- a/src/interfaces/ecpg/preproc/ecpg.trailer
+++ b/src/interfaces/ecpg/preproc/ecpg.trailer
@@ -467,7 +467,7 @@ type_declaration: S_TYPEDEF
         /* an initializer specified */
         initializer = 0;
     }
-    var_type opt_pointer ECPGColLabelCommon opt_array_bounds ';'
+    var_type opt_pointer ECPGColLabel opt_array_bounds ';'
     {
         add_typedef($5, $6.index1, $6.index2, $3.type_enum, $3.type_dimension, $3.type_index, initializer, *$4 ? 1 :
0);

@@ -701,6 +701,43 @@ var_type:    simple_type
                 struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
             }
         }
+        | STRING
+        {
+            /*
+             * It's quite horrid that ECPGColLabelCommon excludes
+             * unreserved_keyword, meaning that unreserved keywords can't be
+             * used as type names in var_type.  However, this is hard to avoid
+             * since what follows ecpgstart can be either a random SQL
+             * statement or an ECPGVarDeclaration (beginning with var_type).
+             * Pending a bright idea about how to fix that, we must
+             * special-case STRING (and any other unreserved keywords that are
+             * likely to be needed here).
+             */
+            if (INFORMIX_MODE)
+            {
+                $$.type_enum = ECPGt_string;
+                $$.type_str = mm_strdup("char");
+                $$.type_dimension = mm_strdup("-1");
+                $$.type_index = mm_strdup("-1");
+                $$.type_sizeof = NULL;
+            }
+            else
+            {
+                /* this is for typedef'ed types */
+                struct typedefs *this = get_typedef("string");
+
+                $$.type_str = (this->type->type_enum == ECPGt_varchar || this->type->type_enum == ECPGt_bytea) ? EMPTY
:mm_strdup(this->name); 
+                $$.type_enum = this->type->type_enum;
+                $$.type_dimension = this->type->type_dimension;
+                $$.type_index = this->type->type_index;
+                if (this->type->type_sizeof && strlen(this->type->type_sizeof) != 0)
+                    $$.type_sizeof = this->type->type_sizeof;
+                else
+                    $$.type_sizeof = cat_str(3, mm_strdup("sizeof("), mm_strdup(this->name), mm_strdup(")"));
+
+                struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
+            }
+        }
         | s_struct_union_symbol
         {
             /* this is for named structs/unions */
@@ -1342,7 +1379,7 @@ ECPGTypedef: TYPE_P
             /* an initializer specified */
             initializer = 0;
         }
-        ECPGColLabelCommon IS var_type opt_array_bounds opt_reference
+        ECPGColLabel IS var_type opt_array_bounds opt_reference
         {
             add_typedef($3, $6.index1, $6.index2, $5.type_enum, $5.type_dimension, $5.type_index, initializer, *$7 ? 1
:0); 


pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)
Next
From: Andres Freund
Date:
Subject: Re: SLRUs in the main buffer pool, redux