Re: Extending outfuncs support to utility statements - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Extending outfuncs support to utility statements
Date
Msg-id 1944074.1663880234@sss.pgh.pa.us
Whole thread Raw
In response to Extending outfuncs support to utility statements  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Extending outfuncs support to utility statements
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2022-09-22 12:48:47 -0400, Tom Lane wrote:
>> After staring at the code a bit, I think we don't need to touch
>> pg_strtok() per se.  I propose that this can be resolved with changes
>> at the next higher level.  Let's make outToken print NULL as <> as
>> it always has, but print an empty string as "" (two double quotes).
>> If the raw input string is two double quotes, print it as \"" to
>> disambiguate.  This'd require a catversion bump when committed,
>> but I don't think there are any showstopper problems otherwise.

> Makes sense to me.

Here is a version of all-but-the-last patch in Peter's series.
I left off the last one because it fails check-world: we now
get through the core regression tests okay, but then the pg_dump
tests fail on the new SQL function.  To fix that, we would have
to extend ruleutils.c's get_utility_query_def() to be able to
fully reconstruct any legal utility query ... which seems like
a pretty dauntingly large amount of tedious manual effort to
start with, and then also a nontrivial additional requirement
on any future patch that adds new utility syntax.  Are we sure
it's worth going there?

But I think it's probably worth committing what we have here
just on testability grounds.

Some notes:

0001, 0002 not changed.

I tweaked 0003 a bit, mainly because I think it's probably not very
safe to apply strncmp to a string we don't know the length of.
It might be difficult to fall off the end of memory that way, but
I wouldn't bet it's impossible.  Also, adding the length checks gets
rid of the need for a grotty order dependency in _readA_Expr().

0004 fixes the empty-string problem as per above.

I did not like what you'd done about imprecise floats one bit.
I think we ought to do it as in 0005 instead: drop all the
hard-wired precision assumptions and just print per Ryu.

0006, 0007, 0008 are basically the same as your previous 0004,
0005, 0006, except for getting rid of the float hacking in 0005.

If you're good with this approach to the float issue, I think
this set is committable (minus 0006 of course, and don't forget
the catversion bump).

            regards, tom lane

From 89def573ced57fc320ad191613dac62c8992c27a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Fri, 12 Aug 2022 21:15:40 +0200
Subject: [PATCH v2 1/9] Fix reading of most-negative integer value nodes

The main parser checks whether a literal fits into an int when
deciding whether it should be put into an Integer or Float node.  The
parser processes integer literals without signs.  So a most-negative
integer literal will not fit into Integer and will end up as a Float
node.

The node tokenizer did this differently.  It included the sign when
checking whether the literal fit into int.  So a most-negative integer
would indeed fit that way and end up as an Integer node.

In order to preserve the node structure correctly, we need the node
tokenizer to also analyze integer literals without sign.

There are a number of test cases in the regression tests that have a
most-negative integer argument of some utility statement, so this
issue is easily reproduced under WRITE_READ_PARSE_PLAN_TREES.
---
 src/backend/nodes/read.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/nodes/read.c b/src/backend/nodes/read.c
index 4a54996b63..a9cb81b129 100644
--- a/src/backend/nodes/read.c
+++ b/src/backend/nodes/read.c
@@ -267,7 +267,7 @@ nodeTokenType(const char *token, int length)
         char       *endptr;

         errno = 0;
-        (void) strtoint(token, &endptr, 10);
+        (void) strtoint(numptr, &endptr, 10);
         if (endptr != token + length || errno == ERANGE)
             return T_Float;
         return T_Integer;
--
2.37.1

From 8a469d0a7195be4ecc65155b82c5836dfb13b920 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Fri, 12 Aug 2022 21:16:26 +0200
Subject: [PATCH v2 2/9] Fix reading of BitString nodes

The node tokenizer went out of its way to store BitString node values
without the leading 'b'.  But everything else in the system stores the
leading 'b'.  This would break if a BitString node is
read-printed-read.

Also, the node tokenizer didn't know that BitString node tokens could
also start with 'x'.
---
 src/backend/nodes/read.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/backend/nodes/read.c b/src/backend/nodes/read.c
index a9cb81b129..fe84f140ee 100644
--- a/src/backend/nodes/read.c
+++ b/src/backend/nodes/read.c
@@ -288,7 +288,7 @@ nodeTokenType(const char *token, int length)
         retval = T_Boolean;
     else if (*token == '"' && length > 1 && token[length - 1] == '"')
         retval = T_String;
-    else if (*token == 'b')
+    else if (*token == 'b' || *token == 'x')
         retval = T_BitString;
     else
         retval = OTHER_TOKEN;
@@ -471,11 +471,10 @@ nodeRead(const char *token, int tok_len)
             break;
         case T_BitString:
             {
-                char       *val = palloc(tok_len);
+                char       *val = palloc(tok_len + 1);

-                /* skip leading 'b' */
-                memcpy(val, token + 1, tok_len - 1);
-                val[tok_len - 1] = '\0';
+                memcpy(val, token, tok_len);
+                val[tok_len] = '\0';
                 result = (Node *) makeBitString(val);
                 break;
             }
--
2.37.1

From b0e5a42805f6e72f2d9c6a8ef693539027fe9c64 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:08:20 -0400
Subject: [PATCH v2 3/9] Add read support for some missing raw parse nodes

The node types A_Const, Constraint, and A_Expr had custom output
functions, but no read functions were implemented so far.

The A_Expr output format had to be tweaked a bit to make it easier to
parse.

Be a bit more cautious about applying strncmp to unterminated strings.

Also error out if an unrecognized enum value is found in each case,
instead of just printing a placeholder value.  That was maybe ok for
debugging but won't work if we want to have robust round-tripping.

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 60610e3a4b..24ea0487e7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -548,12 +548,12 @@ _outA_Expr(StringInfo str, const A_Expr *node)
             WRITE_NODE_FIELD(name);
             break;
         case AEXPR_OP_ANY:
-            WRITE_NODE_FIELD(name);
             appendStringInfoString(str, " ANY");
+            WRITE_NODE_FIELD(name);
             break;
         case AEXPR_OP_ALL:
-            WRITE_NODE_FIELD(name);
             appendStringInfoString(str, " ALL");
+            WRITE_NODE_FIELD(name);
             break;
         case AEXPR_DISTINCT:
             appendStringInfoString(str, " DISTINCT");
@@ -600,7 +600,7 @@ _outA_Expr(StringInfo str, const A_Expr *node)
             WRITE_NODE_FIELD(name);
             break;
         default:
-            appendStringInfoString(str, " ??");
+            elog(ERROR, "unrecognized A_Expr_Kind: %d", (int) node->kind);
             break;
     }
 
@@ -782,8 +782,7 @@ _outConstraint(StringInfo str, const Constraint *node)
             break;
 
         default:
-            appendStringInfo(str, "<unrecognized_constraint %d>",
-                             (int) node->contype);
+            elog(ERROR, "unrecognized ConstrType: %d", (int) node->contype);
             break;
     }
 }
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bee62fc15c..89a9a50f7c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -270,11 +270,11 @@ _readBoolExpr(void)
     /* do-it-yourself enum representation */
     token = pg_strtok(&length); /* skip :boolop */
     token = pg_strtok(&length); /* get field value */
-    if (strncmp(token, "and", 3) == 0)
+    if (length == 3 && strncmp(token, "and", 3) == 0)
         local_node->boolop = AND_EXPR;
-    else if (strncmp(token, "or", 2) == 0)
+    else if (length == 2 && strncmp(token, "or", 2) == 0)
         local_node->boolop = OR_EXPR;
-    else if (strncmp(token, "not", 3) == 0)
+    else if (length == 3 && strncmp(token, "not", 3) == 0)
         local_node->boolop = NOT_EXPR;
     else
         elog(ERROR, "unrecognized boolop \"%.*s\"", length, token);
@@ -285,6 +285,162 @@ _readBoolExpr(void)
     READ_DONE();
 }
 
+static A_Const *
+_readA_Const(void)
+{
+    READ_LOCALS(A_Const);
+
+    token = pg_strtok(&length);
+    if (length == 4 && strncmp(token, "NULL", 4) == 0)
+        local_node->isnull = true;
+    else
+    {
+        union ValUnion *tmp = nodeRead(NULL, 0);
+
+        memcpy(&local_node->val, tmp, sizeof(*tmp));
+    }
+
+    READ_LOCATION_FIELD(location);
+
+    READ_DONE();
+}
+
+/*
+ * _readConstraint
+ */
+static Constraint *
+_readConstraint(void)
+{
+    READ_LOCALS(Constraint);
+
+    READ_STRING_FIELD(conname);
+    READ_BOOL_FIELD(deferrable);
+    READ_BOOL_FIELD(initdeferred);
+    READ_LOCATION_FIELD(location);
+
+    token = pg_strtok(&length); /* skip :contype */
+    token = pg_strtok(&length); /* get field value */
+    if (length == 4 && strncmp(token, "NULL", 4) == 0)
+        local_node->contype = CONSTR_NULL;
+    else if (length == 8 && strncmp(token, "NOT_NULL", 8) == 0)
+        local_node->contype = CONSTR_NOTNULL;
+    else if (length == 7 && strncmp(token, "DEFAULT", 7) == 0)
+        local_node->contype = CONSTR_DEFAULT;
+    else if (length == 8 && strncmp(token, "IDENTITY", 8) == 0)
+        local_node->contype = CONSTR_IDENTITY;
+    else if (length == 9 && strncmp(token, "GENERATED", 9) == 0)
+        local_node->contype = CONSTR_GENERATED;
+    else if (length == 5 && strncmp(token, "CHECK", 5) == 0)
+        local_node->contype = CONSTR_CHECK;
+    else if (length == 11 && strncmp(token, "PRIMARY_KEY", 11) == 0)
+        local_node->contype = CONSTR_PRIMARY;
+    else if (length == 6 && strncmp(token, "UNIQUE", 6) == 0)
+        local_node->contype = CONSTR_UNIQUE;
+    else if (length == 9 && strncmp(token, "EXCLUSION", 9) == 0)
+        local_node->contype = CONSTR_EXCLUSION;
+    else if (length == 11 && strncmp(token, "FOREIGN_KEY", 11) == 0)
+        local_node->contype = CONSTR_FOREIGN;
+    else if (length == 15 && strncmp(token, "ATTR_DEFERRABLE", 15) == 0)
+        local_node->contype = CONSTR_ATTR_DEFERRABLE;
+    else if (length == 19 && strncmp(token, "ATTR_NOT_DEFERRABLE", 19) == 0)
+        local_node->contype = CONSTR_ATTR_NOT_DEFERRABLE;
+    else if (length == 13 && strncmp(token, "ATTR_DEFERRED", 13) == 0)
+        local_node->contype = CONSTR_ATTR_DEFERRED;
+    else if (length == 14 && strncmp(token, "ATTR_IMMEDIATE", 14) == 0)
+        local_node->contype = CONSTR_ATTR_IMMEDIATE;
+
+    switch (local_node->contype)
+    {
+        case CONSTR_NULL:
+        case CONSTR_NOTNULL:
+            /* no extra fields */
+            break;
+
+        case CONSTR_DEFAULT:
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            break;
+
+        case CONSTR_IDENTITY:
+            READ_NODE_FIELD(options);
+            READ_CHAR_FIELD(generated_when);
+            break;
+
+        case CONSTR_GENERATED:
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            READ_CHAR_FIELD(generated_when);
+            break;
+
+        case CONSTR_CHECK:
+            READ_BOOL_FIELD(is_no_inherit);
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            READ_BOOL_FIELD(skip_validation);
+            READ_BOOL_FIELD(initially_valid);
+            break;
+
+        case CONSTR_PRIMARY:
+            READ_NODE_FIELD(keys);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            /* access_method and where_clause not currently used */
+            break;
+
+        case CONSTR_UNIQUE:
+            READ_BOOL_FIELD(nulls_not_distinct);
+            READ_NODE_FIELD(keys);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            /* access_method and where_clause not currently used */
+            break;
+
+        case CONSTR_EXCLUSION:
+            READ_NODE_FIELD(exclusions);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            READ_STRING_FIELD(access_method);
+            READ_NODE_FIELD(where_clause);
+            break;
+
+        case CONSTR_FOREIGN:
+            READ_NODE_FIELD(pktable);
+            READ_NODE_FIELD(fk_attrs);
+            READ_NODE_FIELD(pk_attrs);
+            READ_CHAR_FIELD(fk_matchtype);
+            READ_CHAR_FIELD(fk_upd_action);
+            READ_CHAR_FIELD(fk_del_action);
+            READ_NODE_FIELD(fk_del_set_cols);
+            READ_NODE_FIELD(old_conpfeqop);
+            READ_OID_FIELD(old_pktable_oid);
+            READ_BOOL_FIELD(skip_validation);
+            READ_BOOL_FIELD(initially_valid);
+            break;
+
+        case CONSTR_ATTR_DEFERRABLE:
+        case CONSTR_ATTR_NOT_DEFERRABLE:
+        case CONSTR_ATTR_DEFERRED:
+        case CONSTR_ATTR_IMMEDIATE:
+            /* no extra fields */
+            break;
+
+        default:
+            elog(ERROR, "unrecognized ConstrType: %d", (int) local_node->contype);
+            break;
+    }
+
+    READ_DONE();
+}
+
 static RangeTblEntry *
 _readRangeTblEntry(void)
 {
@@ -376,6 +532,93 @@ _readRangeTblEntry(void)
     READ_DONE();
 }
 
+static A_Expr *
+_readA_Expr(void)
+{
+    READ_LOCALS(A_Expr);
+
+    token = pg_strtok(&length);
+
+    if (length == 3 && strncmp(token, "ANY", 3) == 0)
+    {
+        local_node->kind = AEXPR_OP_ANY;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 3 && strncmp(token, "ALL", 3) == 0)
+    {
+        local_node->kind = AEXPR_OP_ALL;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 8 && strncmp(token, "DISTINCT", 8) == 0)
+    {
+        local_node->kind = AEXPR_DISTINCT;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 12 && strncmp(token, "NOT_DISTINCT", 12) == 0)
+    {
+        local_node->kind = AEXPR_NOT_DISTINCT;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 6 && strncmp(token, "NULLIF", 6) == 0)
+    {
+        local_node->kind = AEXPR_NULLIF;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 2 && strncmp(token, "IN", 2) == 0)
+    {
+        local_node->kind = AEXPR_IN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 4 && strncmp(token, "LIKE", 4) == 0)
+    {
+        local_node->kind = AEXPR_LIKE;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 5 && strncmp(token, "ILIKE", 5) == 0)
+    {
+        local_node->kind = AEXPR_ILIKE;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 7 && strncmp(token, "SIMILAR", 7) == 0)
+    {
+        local_node->kind = AEXPR_SIMILAR;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 7 && strncmp(token, "BETWEEN", 7) == 0)
+    {
+        local_node->kind = AEXPR_BETWEEN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 11 && strncmp(token, "NOT_BETWEEN", 11) == 0)
+    {
+        local_node->kind = AEXPR_NOT_BETWEEN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 11 && strncmp(token, "BETWEEN_SYM", 11) == 0)
+    {
+        local_node->kind = AEXPR_BETWEEN_SYM;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 15 && strncmp(token, "NOT_BETWEEN_SYM", 15) == 0)
+    {
+        local_node->kind = AEXPR_NOT_BETWEEN_SYM;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 5 && strncmp(token, ":name", 5) == 0)
+    {
+        local_node->kind = AEXPR_OP;
+        local_node->name = nodeRead(NULL, 0);
+    }
+    else
+        elog(ERROR, "unrecognized A_Expr kind: \"%.*s\"", length, token);
+
+    READ_NODE_FIELD(lexpr);
+    READ_NODE_FIELD(rexpr);
+    READ_LOCATION_FIELD(location);
+
+    READ_DONE();
+}
+
 static ExtensibleNode *
 _readExtensibleNode(void)
 {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6958306a7d..aead2afd6e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -291,7 +291,7 @@ typedef enum A_Expr_Kind
 
 typedef struct A_Expr
 {
-    pg_node_attr(custom_read_write, no_read)
+    pg_node_attr(custom_read_write)
 
     NodeTag        type;
     A_Expr_Kind kind;            /* see above */
@@ -319,7 +319,7 @@ union ValUnion
 
 typedef struct A_Const
 {
-    pg_node_attr(custom_copy_equal, custom_read_write, no_read)
+    pg_node_attr(custom_copy_equal, custom_read_write)
 
     NodeTag        type;
     union ValUnion val;
@@ -2332,7 +2332,7 @@ typedef enum ConstrType            /* types of constraints */
 
 typedef struct Constraint
 {
-    pg_node_attr(custom_read_write, no_read)
+    pg_node_attr(custom_read_write)
 
     NodeTag        type;
     ConstrType    contype;        /* see above */
From c54032ec63334e260fc5075dfd29e97c2e739a45 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:15:20 -0400
Subject: [PATCH v2 4/9] Fix write/read of empty string fields in Nodes.

Historically, outToken has represented both NULL and empty-string
strings as "<>", which readfuncs.c then read as NULL, thus failing
to preserve empty-string fields accurately.  Remarkably, this has
not caused any serious problems yet, but let's fix it.

We'll keep the "<>" notation for NULL, and use """" for empty string,
because that matches other notational choices already in use.
An actual input string of """" is converted to "\""" (this was true
already, apparently as a hangover from an ancient time when string
quoting was handled directly by pg_strtok).

CHAR fields also use "<>", but for '\0'.

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 24ea0487e7..63dda75ae5 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -135,16 +135,23 @@ static void outChar(StringInfo str, char c);
  *      Convert an ordinary string (eg, an identifier) into a form that
  *      will be decoded back to a plain token by read.c's functions.
  *
- *      If a null or empty string is given, it is encoded as "<>".
+ *      If a null string pointer is given, it is encoded as '<>'.
+ *      An empty string is encoded as '""'.  To avoid ambiguity, input
+ *      strings beginning with '<' or '"' receive a leading backslash.
  */
 void
 outToken(StringInfo str, const char *s)
 {
-    if (s == NULL || *s == '\0')
+    if (s == NULL)
     {
         appendStringInfoString(str, "<>");
         return;
     }
+    if (*s == '\0')
+    {
+        appendStringInfoString(str, "\"\"");
+        return;
+    }
 
     /*
      * Look for characters or patterns that are treated specially by read.c
@@ -178,6 +185,13 @@ outChar(StringInfo str, char c)
 {
     char        in[2];
 
+    /* Traditionally, we've represented \0 as <>, so keep doing that */
+    if (c == '\0')
+    {
+        appendStringInfoString(str, "<>");
+        return;
+    }
+
     in[0] = c;
     in[1] = '\0';
 
@@ -636,7 +650,8 @@ _outString(StringInfo str, const String *node)
 {
     /*
      * We use outToken to provide escaping of the string's content, but we
-     * don't want it to do anything with an empty string.
+     * don't want it to convert an empty string to '""', because we're putting
+     * double quotes around the string already.
      */
     appendStringInfoChar(str, '"');
     if (node->sval[0] != '\0')
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 296104b175..251dff4242 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -178,8 +178,18 @@
 
 #define strtobool(x)  ((*(x) == 't') ? true : false)
 
-#define nullable_string(token,length)  \
-    ((length) == 0 ? NULL : debackslash(token, length))
+static char *
+nullable_string(const char *token, int length)
+{
+    /* outToken emits <> for NULL, and pg_strtok makes that an empty string */
+    if (length == 0)
+        return NULL;
+    /* outToken emits "" for empty string */
+    if (length == 2 && token[0] == '"' && token[1] == '"')
+        return pstrdup("");
+    /* otherwise, we must remove protective backslashes added by outToken */
+    return debackslash(token, length);
+}
 
 
 /*
From 0c39b0f5ad26ae55649a3da12f55938d9adcee2b Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:44:18 -0400
Subject: [PATCH v2 5/9] Don't lose precision for float fields of Nodes.

Historically we've been more worried about making the output of
float fields look pretty than whether they'd be read back exactly.
That won't work if we're to compare the read-back nodes for
equality, so switch to using the Ryu code for float output.

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index b707a09f56..81b8c184a9 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -983,29 +983,29 @@ _read${n}(void)
         }
         elsif ($t eq 'double')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.6f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'Cardinality')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.0f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'Cost')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.2f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'QualCost')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f.startup, \"%.2f\");\n";
-            print $off "\tWRITE_FLOAT_FIELD($f.per_tuple, \"%.2f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f.startup);\n";
+            print $off "\tWRITE_FLOAT_FIELD($f.per_tuple);\n";
             print $rff "\tREAD_FLOAT_FIELD($f.startup);\n"   unless $no_read;
             print $rff "\tREAD_FLOAT_FIELD($f.per_tuple);\n" unless $no_read;
         }
         elsif ($t eq 'Selectivity')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.4f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'char*')
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 63dda75ae5..64c65f060b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -17,6 +17,7 @@
 #include <ctype.h>

 #include "access/attnum.h"
+#include "common/shortest_dec.h"
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/bitmapset.h"
@@ -25,6 +26,7 @@
 #include "utils/datum.h"

 static void outChar(StringInfo str, char c);
+static void outDouble(StringInfo str, double d);


 /*
@@ -69,9 +71,10 @@ static void outChar(StringInfo str, char c);
     appendStringInfo(str, " :" CppAsString(fldname) " %d", \
                      (int) node->fldname)

-/* Write a float field --- caller must give format to define precision */
-#define WRITE_FLOAT_FIELD(fldname,format) \
-    appendStringInfo(str, " :" CppAsString(fldname) " " format, node->fldname)
+/* Write a float field (actually, they're double) */
+#define WRITE_FLOAT_FIELD(fldname) \
+    (appendStringInfo(str, " :" CppAsString(fldname) " "), \
+     outDouble(str, node->fldname))

 /* Write a boolean field */
 #define WRITE_BOOL_FIELD(fldname) \
@@ -198,6 +201,18 @@ outChar(StringInfo str, char c)
     outToken(str, in);
 }

+/*
+ * Convert a double value, attempting to ensure the value is preserved exactly.
+ */
+static void
+outDouble(StringInfo str, double d)
+{
+    char        buf[DOUBLE_SHORTEST_DECIMAL_LEN];
+
+    double_to_shortest_decimal_buf(d, buf);
+    appendStringInfoString(str, buf);
+}
+
 /*
  * common implementation for scalar-array-writing functions
  *
@@ -525,7 +540,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
             break;
         case RTE_NAMEDTUPLESTORE:
             WRITE_STRING_FIELD(enrname);
-            WRITE_FLOAT_FIELD(enrtuples, "%.0f");
+            WRITE_FLOAT_FIELD(enrtuples);
             WRITE_OID_FIELD(relid);
             WRITE_NODE_FIELD(coltypes);
             WRITE_NODE_FIELD(coltypmods);
From 6479e315f700bc65aa8cbe66de07c8fdffd0d8eb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 10 Aug 2022 23:37:31 +0200
Subject: [PATCH v2 6/9] XXX: Turn on WRITE_READ_PARSE_PLAN_TREES for testing

Don't commit this one!
---
 src/include/pg_config_manual.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index f2a106f983..f85a7a7312 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -334,7 +334,7 @@
  * outfuncs.c/readfuncs.c, to facilitate catching errors and omissions in
  * those modules.
  */
-/* #define WRITE_READ_PARSE_PLAN_TREES */
+#define WRITE_READ_PARSE_PLAN_TREES

 /*
  * Define this to force all raw parse trees for DML statements to be scanned
--
2.37.1

From c9aa70f126477ae5c17a74a5c7b32133dc5f7fb3 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:59:28 -0400
Subject: [PATCH v2 7/9] Implement WRITE_READ_PARSE_PLAN_TREES for raw parse trees


diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 35eff28bd3..4952d01183 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -603,10 +603,22 @@ pg_parse_query(const char *query_string)
 #endif

     /*
-     * Currently, outfuncs/readfuncs support is missing for many raw parse
-     * tree nodes, so we don't try to implement WRITE_READ_PARSE_PLAN_TREES
-     * here.
+     * Optional debugging check: pass raw parsetrees through
+     * outfuncs/readfuncs
      */
+#ifdef WRITE_READ_PARSE_PLAN_TREES
+    {
+        char       *str = nodeToString(raw_parsetree_list);
+        List       *new_list = stringToNodeWithLocations(str);
+
+        pfree(str);
+        /* This checks both outfuncs/readfuncs and the equal() routines... */
+        if (!equal(new_list, raw_parsetree_list))
+            elog(WARNING, "outfuncs/readfuncs failed to produce an equal raw parse tree");
+        else
+            raw_parsetree_list = new_list;
+    }
+#endif

     TRACE_POSTGRESQL_QUERY_PARSE_DONE(query_string);

From 92a720e9fbeff96e17185858bbb811f654ecc80f Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 16:05:32 -0400
Subject: [PATCH v2 8/9] Enable WRITE_READ_PARSE_PLAN_TREES of rewritten utility statements

This was previously disabled because we lacked outfuncs/readfuncs
support for most utility statement types.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4952d01183..5352d5f4c6 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -801,7 +801,7 @@ pg_rewrite_query(Query *query)
         new_list = copyObject(querytree_list);
         /* This checks both copyObject() and the equal() routines... */
         if (!equal(new_list, querytree_list))
-            elog(WARNING, "copyObject() failed to produce equal parse tree");
+            elog(WARNING, "copyObject() failed to produce an equal rewritten parse tree");
         else
             querytree_list = new_list;
     }
@@ -813,35 +813,25 @@ pg_rewrite_query(Query *query)
         List       *new_list = NIL;
         ListCell   *lc;

-        /*
-         * We currently lack outfuncs/readfuncs support for most utility
-         * statement types, so only attempt to write/read non-utility queries.
-         */
         foreach(lc, querytree_list)
         {
             Query       *query = lfirst_node(Query, lc);
+            char       *str = nodeToString(query);
+            Query       *new_query = stringToNodeWithLocations(str);

-            if (query->commandType != CMD_UTILITY)
-            {
-                char       *str = nodeToString(query);
-                Query       *new_query = stringToNodeWithLocations(str);
-
-                /*
-                 * queryId is not saved in stored rules, but we must preserve
-                 * it here to avoid breaking pg_stat_statements.
-                 */
-                new_query->queryId = query->queryId;
+            /*
+             * queryId is not saved in stored rules, but we must preserve it
+             * here to avoid breaking pg_stat_statements.
+             */
+            new_query->queryId = query->queryId;

-                new_list = lappend(new_list, new_query);
-                pfree(str);
-            }
-            else
-                new_list = lappend(new_list, query);
+            new_list = lappend(new_list, new_query);
+            pfree(str);
         }

         /* This checks both outfuncs/readfuncs and the equal() routines... */
         if (!equal(new_list, querytree_list))
-            elog(WARNING, "outfuncs/readfuncs failed to produce equal parse tree");
+            elog(WARNING, "outfuncs/readfuncs failed to produce an equal rewritten parse tree");
         else
             querytree_list = new_list;
     }

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [RFC] building postgres with meson - v13
Next
From: Justin Pryzby
Date:
Subject: Re: CI and test improvements