Thread: BUG #18468: CREATE TABLE ... LIKE leaves orphaned column reference in extended statistics

The following bug has been logged on the website:

Bug reference:      18468
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16.3
Operating system:   Ubuntu 22.04
Description:

The following script:
CREATE TABLE t0 (a int, c text, b text);
CREATE STATISTICS ext_stat ON (a || b) FROM t0;
ALTER TABLE t0 DROP COLUMN c;
CREATE TABLE t1 (LIKE t0 INCLUDING ALL);

produces an invalid statistics definition:
\d+ t1
ERROR:  invalid attnum 3 for relation "t1"

SELECT stxname, stxexprs FROM pg_statistic_ext;
 ext_stat     | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 3 ...
 t1_expr_stat | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 3 ...

Though an index definition transferred correctly:
CREATE TABLE t0 (a int, c text, b text);
CREATE INDEX t_idx ON t0((a || b));
ALTER TABLE t0 DROP COLUMN c;
CREATE TABLE t1 (LIKE t0 INCLUDING ALL);
\d+ t1
...
Indexes:
    "t1_expr_idx" btree ((a || b))

SELECT relname, indexprs FROM pg_index, pg_class WHERE indexrelid = oid
  AND relname = 't1_expr_idx';
 t1_expr_idx | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 2 ...




PG Bug reporting form <noreply@postgresql.org> 于2024年5月16日周四 05:41写道:
The following bug has been logged on the website:

Bug reference:      18468
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16.3
Operating system:   Ubuntu 22.04
Description:       

The following script:
CREATE TABLE t0 (a int, c text, b text);
CREATE STATISTICS ext_stat ON (a || b) FROM t0;
ALTER TABLE t0 DROP COLUMN c;
CREATE TABLE t1 (LIKE t0 INCLUDING ALL);

produces an invalid statistics definition:
\d+ t1
ERROR:  invalid attnum 3 for relation "t1"

SELECT stxname, stxexprs FROM pg_statistic_ext;
 ext_stat     | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 3 ...
 t1_expr_stat | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 3 ...

Though an index definition transferred correctly:
CREATE TABLE t0 (a int, c text, b text);
CREATE INDEX t_idx ON t0((a || b));
ALTER TABLE t0 DROP COLUMN c;
CREATE TABLE t1 (LIKE t0 INCLUDING ALL);
\d+ t1
...
Indexes:
    "t1_expr_idx" btree ((a || b))

SELECT relname, indexprs FROM pg_index, pg_class WHERE indexrelid = oid
  AND relname = 't1_expr_idx';
 t1_expr_idx | ({OPEXPR :opno 2780 ... {VAR :varno 1 :varattno 2 ...


Thanks for reporting. I can reproduce on HEAD.
I found that since this commit a4d75c8,  we support statistics on expressions.
Before a4d75c8, CREATE STATISTICS ext_stat ON (a || b) FROM t0;
will report error.

I will continue to look into this issue.
--
Tender Wang
OpenPie:  https://en.openpie.com/
generateClonedExtStatsStmt just copy t0's ext_stat info to t1.
I think we could do something in transformExtendedStatistics() as the comments said.
in transformCreateStmt, has below code:
/*
* Postprocess extended statistics.
*/
transformExtendedStatistics(&cxt);

Right now, there is nothing to do in transformExtendedStatistics. 
Can we update the varattno to the right value here?
--
Tender Wang
OpenPie:  https://en.openpie.com/
Tender Wang <tndrwang@gmail.com> writes:
> generateClonedExtStatsStmt just copy t0's ext_stat info to t1.

So this is a complete mess.  The fundamental problem is that
transformTableLikeClause believes it can process LIKE_STATISTICS
requests immediately, because:

     * We may copy extended statistics if requested, since the representation
     * of CreateStatsStmt doesn't depend on column numbers.

That was true apparently in the original, limited implementation
where we only supported CREATE STATISTICS on plain columns.
It's completely wrong for statistics on expressions.  IMO what we
need to do is make LIKE_STATISTICS work more like LIKE_INDEXES:
postpone the transformation until expandTableLikeClause when we know
the mapping from source columns to destination columns, and have a
chance at converting the expression trees properly, comparably to
generateClonedIndexStmt.

> Right now, there is nothing to do in transformExtendedStatistics.
> Can we update the varattno to the right value here?

We don't know the column mapping there, either.  What we need to have
our hands on is the "attmap" computed in expandTableLikeClause, and
then we can pass the stats expressions through map_variable_attnos().

I think this might not need to be a really large patch, but it
definitely has to change where generateClonedExtStatsStmt is
called from.  I can have a go at it if Tomas doesn't want to.

            regards, tom lane





Tom Lane <tgl@sss.pgh.pa.us> 于2024年5月17日周五 02:45写道:
Tender Wang <tndrwang@gmail.com> writes:
> generateClonedExtStatsStmt just copy t0's ext_stat info to t1.

So this is a complete mess.  The fundamental problem is that
transformTableLikeClause believes it can process LIKE_STATISTICS
requests immediately, because:

     * We may copy extended statistics if requested, since the representation
     * of CreateStatsStmt doesn't depend on column numbers.

The above comments in  transformTableLikeClause and below comments in 
generateClonedExtStatsStmt:
   * Now handle expressions, if there are any. The order (with respect to
   * regular attributes) does not really matter for extended stats, so we
   * simply append them after simple column references.
now should do some revise. 

That was true apparently in the original, limited implementation
where we only supported CREATE STATISTICS on plain columns.
It's completely wrong for statistics on expressions.  IMO what we
need to do is make LIKE_STATISTICS work more like LIKE_INDEXES:
postpone the transformation until expandTableLikeClause when we know
the mapping from source columns to destination columns, and have a
chance at converting the expression trees properly, comparably to
generateClonedIndexStmt.
> Right now, there is nothing to do in transformExtendedStatistics.
> Can we update the varattno to the right value here?

We don't know the column mapping there, either.  What we need to have
our hands on is the "attmap" computed in expandTableLikeClause, and
then we can pass the stats expressions through map_variable_attnos().
 
This fix looks better than what I said in  transformExtendedStatistics.

 
I think this might not need to be a really large patch, but it
definitely has to change where generateClonedExtStatsStmt is
called from.  I can have a go at it if Tomas doesn't want to.

                        regards, tom lane


--
Tender Wang
OpenPie:  https://en.openpie.com/
I wrote:
> We don't know the column mapping there, either.  What we need to have
> our hands on is the "attmap" computed in expandTableLikeClause, and
> then we can pass the stats expressions through map_variable_attnos().

> I think this might not need to be a really large patch, but it
> definitely has to change where generateClonedExtStatsStmt is
> called from.  I can have a go at it if Tomas doesn't want to.

Yeah, this doesn't require a whole lot more than moving the
processing in transformTableLikeClause to expandTableLikeClause.

I chose to get rid of transformExtendedStatistics, because not
only is there nothing for it to do, there will never be any
statements for it to do nothing to.

I plan to sit on this till after the beta1 freeze.

            regards, tom lane

diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 639cfa443e..d5c2b2ff0b 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -87,7 +87,6 @@ typedef struct
     List       *fkconstraints;    /* FOREIGN KEY constraints */
     List       *ixconstraints;    /* index-creating constraints */
     List       *likeclauses;    /* LIKE clauses that need post-processing */
-    List       *extstats;        /* cloned extended statistics */
     List       *blist;            /* "before list" of things to do before
                                  * creating the table */
     List       *alist;            /* "after list" of things to do after creating
@@ -120,13 +119,14 @@ static void transformTableLikeClause(CreateStmtContext *cxt,
 static void transformOfType(CreateStmtContext *cxt,
                             TypeName *ofTypename);
 static CreateStatsStmt *generateClonedExtStatsStmt(RangeVar *heapRel,
-                                                   Oid heapRelid, Oid source_statsid);
+                                                   Oid heapRelid,
+                                                   Oid source_statsid,
+                                                   const AttrMap *attmap);
 static List *get_collation(Oid collation, Oid actual_datatype);
 static List *get_opclass(Oid opclass, Oid actual_datatype);
 static void transformIndexConstraints(CreateStmtContext *cxt);
 static IndexStmt *transformIndexConstraint(Constraint *constraint,
                                            CreateStmtContext *cxt);
-static void transformExtendedStatistics(CreateStmtContext *cxt);
 static void transformFKConstraints(CreateStmtContext *cxt,
                                    bool skipValidation,
                                    bool isAddConstraint);
@@ -246,7 +246,6 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
     cxt.fkconstraints = NIL;
     cxt.ixconstraints = NIL;
     cxt.likeclauses = NIL;
-    cxt.extstats = NIL;
     cxt.blist = NIL;
     cxt.alist = NIL;
     cxt.pkey = NULL;
@@ -339,11 +338,6 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
      */
     transformCheckConstraints(&cxt, !cxt.isforeign);

-    /*
-     * Postprocess extended statistics.
-     */
-    transformExtendedStatistics(&cxt);
-
     /*
      * Output results.
      */
@@ -1111,61 +1105,25 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
     }

     /*
-     * We cannot yet deal with defaults, CHECK constraints, or indexes, since
-     * we don't yet know what column numbers the copied columns will have in
-     * the finished table.  If any of those options are specified, add the
-     * LIKE clause to cxt->likeclauses so that expandTableLikeClause will be
-     * called after we do know that.  Also, remember the relation OID so that
-     * expandTableLikeClause is certain to open the same table.
+     * We cannot yet deal with defaults, CHECK constraints, indexes, or
+     * statistics, since we don't yet know what column numbers the copied
+     * columns will have in the finished table.  If any of those options are
+     * specified, add the LIKE clause to cxt->likeclauses so that
+     * expandTableLikeClause will be called after we do know that.  Also,
+     * remember the relation OID so that expandTableLikeClause is certain to
+     * open the same table.
      */
     if (table_like_clause->options &
         (CREATE_TABLE_LIKE_DEFAULTS |
          CREATE_TABLE_LIKE_GENERATED |
          CREATE_TABLE_LIKE_CONSTRAINTS |
-         CREATE_TABLE_LIKE_INDEXES))
+         CREATE_TABLE_LIKE_INDEXES |
+         CREATE_TABLE_LIKE_STATISTICS))
     {
         table_like_clause->relationOid = RelationGetRelid(relation);
         cxt->likeclauses = lappend(cxt->likeclauses, table_like_clause);
     }

-    /*
-     * We may copy extended statistics if requested, since the representation
-     * of CreateStatsStmt doesn't depend on column numbers.
-     */
-    if (table_like_clause->options & CREATE_TABLE_LIKE_STATISTICS)
-    {
-        List       *parent_extstats;
-        ListCell   *l;
-
-        parent_extstats = RelationGetStatExtList(relation);
-
-        foreach(l, parent_extstats)
-        {
-            Oid            parent_stat_oid = lfirst_oid(l);
-            CreateStatsStmt *stats_stmt;
-
-            stats_stmt = generateClonedExtStatsStmt(cxt->relation,
-                                                    RelationGetRelid(relation),
-                                                    parent_stat_oid);
-
-            /* Copy comment on statistics object, if requested */
-            if (table_like_clause->options & CREATE_TABLE_LIKE_COMMENTS)
-            {
-                comment = GetComment(parent_stat_oid, StatisticExtRelationId, 0);
-
-                /*
-                 * We make use of CreateStatsStmt's stxcomment option, so as
-                 * not to need to know now what name the statistics will have.
-                 */
-                stats_stmt->stxcomment = comment;
-            }
-
-            cxt->extstats = lappend(cxt->extstats, stats_stmt);
-        }
-
-        list_free(parent_extstats);
-    }
-
     /*
      * Close the parent rel, but keep our AccessShareLock on it until xact
      * commit.  That will prevent someone else from deleting or ALTERing the
@@ -1423,6 +1381,44 @@ expandTableLikeClause(RangeVar *heapRel, TableLikeClause *table_like_clause)
         }
     }

+    /*
+     * Process extended statistics if required.
+     */
+    if (table_like_clause->options & CREATE_TABLE_LIKE_STATISTICS)
+    {
+        List       *parent_extstats;
+        ListCell   *l;
+
+        parent_extstats = RelationGetStatExtList(relation);
+
+        foreach(l, parent_extstats)
+        {
+            Oid            parent_stat_oid = lfirst_oid(l);
+            CreateStatsStmt *stats_stmt;
+
+            stats_stmt = generateClonedExtStatsStmt(heapRel,
+                                                    RelationGetRelid(childrel),
+                                                    parent_stat_oid,
+                                                    attmap);
+
+            /* Copy comment on statistics object, if requested */
+            if (table_like_clause->options & CREATE_TABLE_LIKE_COMMENTS)
+            {
+                comment = GetComment(parent_stat_oid, StatisticExtRelationId, 0);
+
+                /*
+                 * We make use of CreateStatsStmt's stxcomment option, so as
+                 * not to need to know now what name the statistics will have.
+                 */
+                stats_stmt->stxcomment = comment;
+            }
+
+            result = lappend(result, stats_stmt);
+        }
+
+        list_free(parent_extstats);
+    }
+
     /* Done with child rel */
     table_close(childrel, NoLock);

@@ -1837,10 +1833,12 @@ generateClonedIndexStmt(RangeVar *heapRel, Relation source_idx,
  * Generate a CreateStatsStmt node using information from an already existing
  * extended statistic "source_statsid", for the rel identified by heapRel and
  * heapRelid.
+ *
+ * Attribute numbers in expression Vars are adjusted according to attmap.
  */
 static CreateStatsStmt *
 generateClonedExtStatsStmt(RangeVar *heapRel, Oid heapRelid,
-                           Oid source_statsid)
+                           Oid source_statsid, const AttrMap *attmap)
 {
     HeapTuple    ht_stats;
     Form_pg_statistic_ext statsrec;
@@ -1923,10 +1921,19 @@ generateClonedExtStatsStmt(RangeVar *heapRel, Oid heapRelid,

         foreach(lc, exprs)
         {
+            Node       *expr = (Node *) lfirst(lc);
             StatsElem  *selem = makeNode(StatsElem);
+            bool        found_whole_row;
+
+            /* Adjust Vars to match new table's column numbering */
+            expr = map_variable_attnos(expr,
+                                       1, 0,
+                                       attmap,
+                                       InvalidOid,
+                                       &found_whole_row);

             selem->name = NULL;
-            selem->expr = (Node *) lfirst(lc);
+            selem->expr = expr;

             def_names = lappend(def_names, selem);
         }
@@ -2652,19 +2659,6 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
     return index;
 }

-/*
- * transformExtendedStatistics
- *     Handle extended statistic objects
- *
- * Right now, there's nothing to do here, so we just append the list to
- * the existing "after" list.
- */
-static void
-transformExtendedStatistics(CreateStmtContext *cxt)
-{
-    cxt->alist = list_concat(cxt->alist, cxt->extstats);
-}
-
 /*
  * transformCheckConstraints
  *        handle CHECK constraints
@@ -3457,7 +3451,6 @@ transformAlterTableStmt(Oid relid, AlterTableStmt *stmt,
     cxt.fkconstraints = NIL;
     cxt.ixconstraints = NIL;
     cxt.likeclauses = NIL;
-    cxt.extstats = NIL;
     cxt.blist = NIL;
     cxt.alist = NIL;
     cxt.pkey = NULL;
@@ -3763,9 +3756,6 @@ transformAlterTableStmt(Oid relid, AlterTableStmt *stmt,
         newcmds = lappend(newcmds, newcmd);
     }

-    /* Append extended statistics objects */
-    transformExtendedStatistics(&cxt);
-
     /* Close rel */
     relation_close(rel, NoLock);

diff --git a/src/test/regress/expected/create_table_like.out b/src/test/regress/expected/create_table_like.out
index 0ed94f1d2f..6bfc6d040f 100644
--- a/src/test/regress/expected/create_table_like.out
+++ b/src/test/regress/expected/create_table_like.out
@@ -261,8 +261,23 @@ Check constraints:
 Inherits: test_like_5,
           test_like_5x

+-- Test updating of column numbers in statistics expressions (bug #18468)
+CREATE TABLE test_like_6 (a int, c text, b text);
+CREATE STATISTICS ext_stat ON (a || b) FROM test_like_6;
+ALTER TABLE test_like_6 DROP COLUMN c;
+CREATE TABLE test_like_6c (LIKE test_like_6 INCLUDING ALL);
+\d+ test_like_6c
+                                Table "public.test_like_6c"
+ Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
+--------+---------+-----------+----------+---------+----------+--------------+-------------
+ a      | integer |           |          |         | plain    |              |
+ b      | text    |           |          |         | extended |              |
+Statistics objects:
+    "public.test_like_6c_expr_stat" ON (a || b) FROM test_like_6c
+
 DROP TABLE test_like_4, test_like_4a, test_like_4b, test_like_4c, test_like_4d;
 DROP TABLE test_like_5, test_like_5x, test_like_5c;
+DROP TABLE test_like_6, test_like_6c;
 CREATE TABLE inhg (x text, LIKE inhx INCLUDING INDEXES, y text); /* copies indexes */
 INSERT INTO inhg VALUES (5, 10);
 INSERT INTO inhg VALUES (20, 10); -- should fail
diff --git a/src/test/regress/sql/create_table_like.sql b/src/test/regress/sql/create_table_like.sql
index 4929d373a2..04008a027b 100644
--- a/src/test/regress/sql/create_table_like.sql
+++ b/src/test/regress/sql/create_table_like.sql
@@ -95,8 +95,16 @@ CREATE TABLE test_like_5c (LIKE test_like_4 INCLUDING ALL)
   INHERITS (test_like_5, test_like_5x);
 \d test_like_5c

+-- Test updating of column numbers in statistics expressions (bug #18468)
+CREATE TABLE test_like_6 (a int, c text, b text);
+CREATE STATISTICS ext_stat ON (a || b) FROM test_like_6;
+ALTER TABLE test_like_6 DROP COLUMN c;
+CREATE TABLE test_like_6c (LIKE test_like_6 INCLUDING ALL);
+\d+ test_like_6c
+
 DROP TABLE test_like_4, test_like_4a, test_like_4b, test_like_4c, test_like_4d;
 DROP TABLE test_like_5, test_like_5x, test_like_5c;
+DROP TABLE test_like_6, test_like_6c;

 CREATE TABLE inhg (x text, LIKE inhx INCLUDING INDEXES, y text); /* copies indexes */
 INSERT INTO inhg VALUES (5, 10);



Tom Lane <tgl@sss.pgh.pa.us> 于2024年5月18日周六 04:39写道:
I wrote:
> We don't know the column mapping there, either.  What we need to have
> our hands on is the "attmap" computed in expandTableLikeClause, and
> then we can pass the stats expressions through map_variable_attnos().

> I think this might not need to be a really large patch, but it
> definitely has to change where generateClonedExtStatsStmt is
> called from.  I can have a go at it if Tomas doesn't want to.

Yeah, this doesn't require a whole lot more than moving the
processing in transformTableLikeClause to expandTableLikeClause.

I chose to get rid of transformExtendedStatistics, because not
only is there nothing for it to do, there will never be any
statements for it to do nothing to.
 
The patch looks good to me.

--
Tender Wang
OpenPie:  https://en.openpie.com/