Extending outfuncs support to utility statements - Mailing list pgsql-hackers

From Tom Lane
Subject Extending outfuncs support to utility statements
Date
Msg-id 4159834.1657405226@sss.pgh.pa.us
Whole thread Raw
Responses Re: Extending outfuncs support to utility statements
Re: Extending outfuncs support to utility statements
List pgsql-hackers
We've long avoided building I/O support for utility-statement node
types, mainly because it didn't seem worth the trouble to write and
maintain such code by hand.  Now that the automatic node-support-code
generation patch is in, that argument is gone, and it's just a matter
of whether the benefits are worth the backend code bloat.  I can
see two benefits worth considering:

* Seems like having such support would be pretty useful for
debugging.

* The only reason struct Query still needs a handwritten output
function is that special logic is needed to prevent trying to
print the utilityStatement field when it's a utility statement
we lack outfuncs support for.  Now it wouldn't be that hard
to get gen_node_support.pl to replicate that special logic,
and if we stick with the status-quo functionality then I think we
should do that so that we can get rid of the handwritten function.
But the other alternative is to provide outfuncs support for all
utility statements and drop the conditionality.

So I looked into how much code are we talking about.  On my
RHEL8 x86_64 machine, the code sizes for outfuncs/readfuncs
as of HEAD are

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 117173       0       0  117173   1c9b5 outfuncs.o
  64540       0       0   64540    fc1c readfuncs.o

If we just open the floodgates and enable both outfuncs and
readfuncs support for all *Stmt nodes (plus some node types
that thereby become dumpable, like AlterTableCmd), then
this becomes

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 139503       0       0  139503   220ef outfuncs.o
  95562       0       0   95562   1754a readfuncs.o

For my taste, the circa 20K growth in outfuncs.o is an okay
price for being able to inspect utility statements more easily.
However, I'm less thrilled with the 30K growth in readfuncs.o,
because I can't see that we'd get any direct benefit from that.
So I think a realistic proposal is to enable outfuncs support
but keep readfuncs disabled.  The attached WIP patch does that,
and gives me these code sizes:

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 139503       0       0  139503   220ef outfuncs.o
  69356       0       0   69356   10eec readfuncs.o

(The extra readfuncs space comes from not troubling over the
subsidiary node types such as AlterTableCmd.  We could run
around and mark those no_read, but I didn't bother yet.)

The support-suppression code in gen_node_support.pl was a crude
hack before, and this patch doesn't make it any less so.
If we go this way, it would be better to move the knowledge that
we're suppressing read functionality into the utility statement
node declarations.  We could just manually label them all
pg_node_attr(no_read), but what I'm kind of tempted to do is
invent a dummy abstract node type like Expr, and make all the
utility statements inherit from it:

typedef struct UtilityStmt
{
    pg_node_attr(abstract, no_read)

    NodeTag        type;
} UtilityStmt;

Thoughts?

            regards, tom lane

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 4a7902e6bf..fc699c3f19 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -97,21 +97,19 @@ push @scalar_types, qw(QualCost);

 # XXX various things we are not publishing right now to stay level
 # with the manual system
-push @no_copy,  qw(CallContext InlineCodeBlock);
-push @no_equal, qw(CallContext InlineCodeBlock);
-push @no_read_write,
-  qw(AccessPriv AlterTableCmd CallContext CreateOpClassItem FunctionParameter InferClause InlineCodeBlock
ObjectWithArgsOnConflictClause PartitionCmd RoleSpec VacuumRelation); 
-push @no_read, qw(A_ArrayExpr A_Indices A_Indirection AlterStatsStmt
-  CollateClause ColumnDef ColumnRef CreateForeignTableStmt CreateStatsStmt
-  CreateStmt FuncCall ImportForeignSchemaStmt IndexElem IndexStmt
+push @no_copy,       qw(CallContext InlineCodeBlock);
+push @no_equal,      qw(CallContext InlineCodeBlock);
+push @no_read_write, qw(CallContext InlineCodeBlock);
+push @no_read,       qw(A_ArrayExpr A_Indices A_Indirection
+  CollateClause ColumnDef ColumnRef FuncCall IndexElem
   JsonAggConstructor JsonArgument JsonArrayAgg JsonArrayConstructor
   JsonArrayQueryConstructor JsonCommon JsonFuncExpr JsonKeyValue
   JsonObjectAgg JsonObjectConstructor JsonOutput JsonParseExpr JsonScalarExpr
   JsonSerializeExpr JsonTable JsonTableColumn JsonTablePlan LockingClause
-  MultiAssignRef PLAssignStmt ParamRef PartitionElem PartitionSpec
+  MultiAssignRef ParamRef PartitionElem PartitionSpec
   PlaceHolderVar PublicationObjSpec PublicationTable RangeFunction
-  RangeSubselect RangeTableFunc RangeTableFuncCol RangeTableSample RawStmt
-  ResTarget ReturnStmt SelectStmt SortBy StatsElem TableLikeClause
+  RangeSubselect RangeTableFunc RangeTableFuncCol RangeTableSample
+  ResTarget SortBy StatsElem TableLikeClause
   TriggerTransition TypeCast TypeName WindowDef WithClause XmlSerialize);


@@ -693,16 +691,16 @@ foreach my $n (@node_types)
     next if elem $n, @abstract_types;
     next if elem $n, @no_read_write;

-    # XXX For now, skip all "Stmt"s except that ones that were there before.
+    my $no_read = (elem $n, @no_read);
+
+    # Disable read support for utility statements.
+    # XXX this is a very ugly way to do it.
     if ($n =~ /Stmt$/)
     {
-        my @keep =
-          qw(AlterStatsStmt CreateForeignTableStmt CreateStatsStmt CreateStmt DeclareCursorStmt
ImportForeignSchemaStmtIndexStmt NotifyStmt PlannedStmt PLAssignStmt RawStmt ReturnStmt SelectStmt SetOperationStmt); 
-        next unless elem $n, @keep;
+        my @keep = qw(NotifyStmt PlannedStmt SetOperationStmt);
+        $no_read = 1 unless elem $n, @keep;
     }

-    my $no_read = (elem $n, @no_read);
-
     # output format starts with upper case node type name
     my $N = uc $n;

@@ -725,13 +723,20 @@ _out${n}(StringInfo str, const $n *node)

 ";

-    print $rff "
+    if (!$no_read)
+    {
+        my $macro =
+          (@{ $node_type_info{$n}->{fields} } > 0)
+          ? 'READ_LOCALS'
+          : 'READ_LOCALS_NO_FIELDS';
+        print $rff "
 static $n *
 _read${n}(void)
 {
-\tREAD_LOCALS($n);
+\t$macro($n);

-" unless $no_read;
+";
+    }

     # print instructions for each field
     foreach my $f (@{ $node_type_info{$n}->{fields} })
@@ -781,6 +786,7 @@ _read${n}(void)
             print $rff "\tREAD_LOCATION_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'int'
+            || $t eq 'int16'
             || $t eq 'int32'
             || $t eq 'AttrNumber'
             || $t eq 'StrategyNumber')
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7b51..85ac0b5e21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -415,79 +415,6 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
     methods->nodeOut(str, node);
 }

-static void
-_outQuery(StringInfo str, const Query *node)
-{
-    WRITE_NODE_TYPE("QUERY");
-
-    WRITE_ENUM_FIELD(commandType, CmdType);
-    WRITE_ENUM_FIELD(querySource, QuerySource);
-    /* we intentionally do not print the queryId field */
-    WRITE_BOOL_FIELD(canSetTag);
-
-    /*
-     * Hack to work around missing outfuncs routines for a lot of the
-     * utility-statement node types.  (The only one we actually *need* for
-     * rules support is NotifyStmt.)  Someday we ought to support 'em all, but
-     * for the meantime do this to avoid getting lots of warnings when running
-     * with debug_print_parse on.
-     */
-    if (node->utilityStmt)
-    {
-        switch (nodeTag(node->utilityStmt))
-        {
-            case T_CreateStmt:
-            case T_IndexStmt:
-            case T_NotifyStmt:
-            case T_DeclareCursorStmt:
-                WRITE_NODE_FIELD(utilityStmt);
-                break;
-            default:
-                appendStringInfoString(str, " :utilityStmt ?");
-                break;
-        }
-    }
-    else
-        appendStringInfoString(str, " :utilityStmt <>");
-
-    WRITE_INT_FIELD(resultRelation);
-    WRITE_BOOL_FIELD(hasAggs);
-    WRITE_BOOL_FIELD(hasWindowFuncs);
-    WRITE_BOOL_FIELD(hasTargetSRFs);
-    WRITE_BOOL_FIELD(hasSubLinks);
-    WRITE_BOOL_FIELD(hasDistinctOn);
-    WRITE_BOOL_FIELD(hasRecursive);
-    WRITE_BOOL_FIELD(hasModifyingCTE);
-    WRITE_BOOL_FIELD(hasForUpdate);
-    WRITE_BOOL_FIELD(hasRowSecurity);
-    WRITE_BOOL_FIELD(isReturn);
-    WRITE_NODE_FIELD(cteList);
-    WRITE_NODE_FIELD(rtable);
-    WRITE_NODE_FIELD(jointree);
-    WRITE_NODE_FIELD(targetList);
-    WRITE_ENUM_FIELD(override, OverridingKind);
-    WRITE_NODE_FIELD(onConflict);
-    WRITE_NODE_FIELD(returningList);
-    WRITE_NODE_FIELD(groupClause);
-    WRITE_BOOL_FIELD(groupDistinct);
-    WRITE_NODE_FIELD(groupingSets);
-    WRITE_NODE_FIELD(havingQual);
-    WRITE_NODE_FIELD(windowClause);
-    WRITE_NODE_FIELD(distinctClause);
-    WRITE_NODE_FIELD(sortClause);
-    WRITE_NODE_FIELD(limitOffset);
-    WRITE_NODE_FIELD(limitCount);
-    WRITE_ENUM_FIELD(limitOption, LimitOption);
-    WRITE_NODE_FIELD(rowMarks);
-    WRITE_NODE_FIELD(setOperations);
-    WRITE_NODE_FIELD(constraintDeps);
-    WRITE_NODE_FIELD(withCheckOptions);
-    WRITE_NODE_FIELD(mergeActionList);
-    WRITE_BOOL_FIELD(mergeUseOuterJoin);
-    WRITE_LOCATION_FIELD(stmt_location);
-    WRITE_INT_FIELD(stmt_len);
-}
-
 static void
 _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 {
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..a2391280be 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -240,56 +240,6 @@ readBitmapset(void)
  * special_read_write attribute
  */

-static Query *
-_readQuery(void)
-{
-    READ_LOCALS(Query);
-
-    READ_ENUM_FIELD(commandType, CmdType);
-    READ_ENUM_FIELD(querySource, QuerySource);
-    local_node->queryId = UINT64CONST(0);    /* not saved in output format */
-    READ_BOOL_FIELD(canSetTag);
-    READ_NODE_FIELD(utilityStmt);
-    READ_INT_FIELD(resultRelation);
-    READ_BOOL_FIELD(hasAggs);
-    READ_BOOL_FIELD(hasWindowFuncs);
-    READ_BOOL_FIELD(hasTargetSRFs);
-    READ_BOOL_FIELD(hasSubLinks);
-    READ_BOOL_FIELD(hasDistinctOn);
-    READ_BOOL_FIELD(hasRecursive);
-    READ_BOOL_FIELD(hasModifyingCTE);
-    READ_BOOL_FIELD(hasForUpdate);
-    READ_BOOL_FIELD(hasRowSecurity);
-    READ_BOOL_FIELD(isReturn);
-    READ_NODE_FIELD(cteList);
-    READ_NODE_FIELD(rtable);
-    READ_NODE_FIELD(jointree);
-    READ_NODE_FIELD(targetList);
-    READ_ENUM_FIELD(override, OverridingKind);
-    READ_NODE_FIELD(onConflict);
-    READ_NODE_FIELD(returningList);
-    READ_NODE_FIELD(groupClause);
-    READ_BOOL_FIELD(groupDistinct);
-    READ_NODE_FIELD(groupingSets);
-    READ_NODE_FIELD(havingQual);
-    READ_NODE_FIELD(windowClause);
-    READ_NODE_FIELD(distinctClause);
-    READ_NODE_FIELD(sortClause);
-    READ_NODE_FIELD(limitOffset);
-    READ_NODE_FIELD(limitCount);
-    READ_ENUM_FIELD(limitOption, LimitOption);
-    READ_NODE_FIELD(rowMarks);
-    READ_NODE_FIELD(setOperations);
-    READ_NODE_FIELD(constraintDeps);
-    READ_NODE_FIELD(withCheckOptions);
-    READ_NODE_FIELD(mergeActionList);
-    READ_BOOL_FIELD(mergeUseOuterJoin);
-    READ_LOCATION_FIELD(stmt_location);
-    READ_INT_FIELD(stmt_len);
-
-    READ_DONE();
-}
-
 static Const *
 _readConst(void)
 {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 0b6a7bb365..617ba619de 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -117,8 +117,6 @@ typedef uint32 AclMode;            /* a bitmask of privilege bits */
  */
 typedef struct Query
 {
-    pg_node_attr(custom_read_write)
-
     NodeTag        type;

     CmdType        commandType;    /* select|insert|update|delete|merge|utility */
@@ -126,10 +124,10 @@ typedef struct Query
     QuerySource querySource;    /* where did I come from? */

     /*
-     * query identifier (can be set by plugins); ignored for equal, might not
-     * be set
+     * query identifier (can be set by plugins); ignored for equal, as it
+     * might not be set; also not stored
      */
-    uint64        queryId pg_node_attr(equal_ignore, read_as(0));
+    uint64        queryId pg_node_attr(equal_ignore, read_write_ignore, read_as(0));

     bool        canSetTag;        /* do I set the command result tag? */


pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Compilation issue on Solaris.
Next
From: Tom Lane
Date:
Subject: Making CallContext and InlineCodeBlock less special-case-y