Thread: Re: ANALYZE versus expression indexes with nondefault opckeytype

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

"Kevin Grittner"

Date:

31 July 2010, 19:27:14

Robert Haas  07/31/10 12:33 PM >>>
> Tom Lane  wrote:
>> Robert Haas  writes:
>>> I think this whole discussion is starting with the wrong premise.
>>> This is not a bug fix; therefore, it's 9.1 material.
>>
>> Failing to store stats isn't a bug?
> 
> Well, it kind of sounds more like you're removing a known
> limitation than fixing a bug.

It's operating as designed and documented.  There is room for
enhancement, but the only thing which could possibly justify this as
9.0 material is if there was a demonstrated performance regression in
9.0 for which this was the safest cure.
-Kevin

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Stephen Frost

Date:

01 August 2010, 01:17:04

* Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
> Robert Haas  07/31/10 12:33 PM >>>
> > Tom Lane  wrote:
> >> Failing to store stats isn't a bug?
> >
> > Well, it kind of sounds more like you're removing a known
> > limitation than fixing a bug.
>
> It's operating as designed and documented.  There is room for
> enhancement, but the only thing which could possibly justify this as
> 9.0 material is if there was a demonstrated performance regression in
> 9.0 for which this was the safest cure.

I have to disagree with this, to be honest.  The fact that we've
documented what is completely unexpected and frustrating behaviour
doesn't mean we get to say it's not a bug.  Not collecting stats, at
all, is a pretty bad bug, in my view.  Stats are an important part of
the system which needs to work at least decently.  Perhaps before it was
pretty rare that we'd have the situation described (before we brought in
tsearch2), but it's not any longer and we need to support it as we would
the other types.  The only reason I'm against backpatching it to the
beginning is that it's either an ABI change or some rather grotty code,
and even then it wouldn't be hard to push me to accepting the grotty
code if we make the cleaner change for 9.0 and going forward, especially
as we have people in the wild being affected by it.

Certain other databases have done a very good job of documenting their
bugs and in some cases even calling them features.  I'd rather we not go
down that path.  I don't see the lack of stats collecting to be a simple
'limitation'.
Thanks,
    Stephen

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Robert Haas

Date:

01 August 2010, 01:48:15

On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
>> Robert Haas  07/31/10 12:33 PM >>>
>> > Tom Lane  wrote:
>> >> Failing to store stats isn't a bug?
>> >
>> > Well, it kind of sounds more like you're removing a known
>> > limitation than fixing a bug.
>>
>> It's operating as designed and documented.  There is room for
>> enhancement, but the only thing which could possibly justify this as
>> 9.0 material is if there was a demonstrated performance regression in
>> 9.0 for which this was the safest cure.
>
> I have to disagree with this, to be honest.  The fact that we've
> documented what is completely unexpected and frustrating behaviour
> doesn't mean we get to say it's not a bug.  Not collecting stats, at
> all, is a pretty bad bug, in my view.

I guess I'd appreciate it if someone could explain in more detail in
what cases we fail to collect stats.  Do we have a typanalyze function
here that can't possibly work for anything, ever?  Or is it just some
subset of the cases?

(Apologies if this has been discussed on the original thread; I was
unable to find it in the archives.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Tom Lane

Date:

01 August 2010, 03:15:54

Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Kevin Grittner (Kevin.Grittner@wicourts.gov) wrote:
>>> Robert Haas  07/31/10 12:33 PM >>>
>>>> Tom Lane  wrote:
>>>>> Failing to store stats isn't a bug?

>>>> Well, it kind of sounds more like you're removing a known
>>>> limitation than fixing a bug.

>>> It's operating as designed and documented.

>> I have to disagree with this, to be honest.  The fact that we've
>> documented what is completely unexpected and frustrating behaviour
>> doesn't mean we get to say it's not a bug.  Not collecting stats, at
>> all, is a pretty bad bug, in my view.

I'm a bit bemused by the claim that this behavior is "documented".  One
comment buried deep in the bowels of the source is not user-visible
documentation in my book.

> I guess I'd appreciate it if someone could explain in more detail in
> what cases we fail to collect stats.  Do we have a typanalyze function
> here that can't possibly work for anything, ever?  Or is it just some
> subset of the cases?

ANALYZE normally collects stats for any expression that there is an
expression index for.  However, it will punt and fail to collect stats
if the expression index uses an opclass whose opckeytype (ie, storage
datatype) is different from the actual expression datatype.  A quick
look into the system catalogs shows that that applies to these opclasses:
 amname |     opcname      |           opcintype           |         opckeytype
--------+------------------+-------------------------------+-----------------------------
 btree  | name_ops         | name                          | cstring
 gist   | point_ops        | point                         | box
 gist   | poly_ops         | polygon                       | box
 gist   | circle_ops       | circle                        | box
 gin    | _int4_ops        | integer[]                     | integer
 gin    | _text_ops        | text[]                        | text
 gin    | _abstime_ops     | abstime[]                     | abstime
 gin    | _bit_ops         | bit[]                         | bit
 gin    | _bool_ops        | boolean[]                     | boolean
 gin    | _bpchar_ops      | character[]                   | character
 gin    | _bytea_ops       | bytea[]                       | bytea
 gin    | _char_ops        | "char"[]                      | "char"
 gin    | _cidr_ops        | cidr[]                        | cidr
 gin    | _date_ops        | date[]                        | date
 gin    | _float4_ops      | real[]                        | real
 gin    | _float8_ops      | double precision[]            | double precision
 gin    | _inet_ops        | inet[]                        | inet
 gin    | _int2_ops        | smallint[]                    | smallint
 gin    | _int8_ops        | bigint[]                      | bigint
 gin    | _interval_ops    | interval[]                    | interval
 gin    | _macaddr_ops     | macaddr[]                     | macaddr
 gin    | _name_ops        | name[]                        | name
 gin    | _numeric_ops     | numeric[]                     | numeric
 gin    | _oid_ops         | oid[]                         | oid
 gin    | _oidvector_ops   | oidvector[]                   | oidvector
 gin    | _time_ops        | time without time zone[]      | time without time zone
 gin    | _timestamptz_ops | timestamp with time zone[]    | timestamp with time zone
 gin    | _timetz_ops      | time with time zone[]         | time with time zone
 gin    | _varbit_ops      | bit varying[]                 | bit varying
 gin    | _varchar_ops     | character varying[]           | character varying
 gin    | _timestamp_ops   | timestamp without time zone[] | timestamp without time zone
 gin    | _money_ops       | money[]                       | money
 gin    | _reltime_ops     | reltime[]                     | reltime
 gin    | _tinterval_ops   | tinterval[]                   | tinterval
 gist   | tsvector_ops     | tsvector                      | gtsvector
 gin    | tsvector_ops     | tsvector                      | text
 gist   | tsquery_ops      | tsquery                       | bigint
(37 rows)

Now, of the above the only cases where we'd be likely to be able to do
anything very useful with stats on the expression value are the name
case, which isn't that exciting in practice, and the tsvector cases.
For tsvector it was only with 8.4 that we had non-toy stats code, so
while the limitation is ancient it's only recently that it started to be
meaningful.

I don't think this can be claimed to be a corner case.  If you set up
an FTS index according to the first alternative offered in

http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

you will find that the system fails to collect stats for it and so you
get stupid default estimates for your FTS queries.  If this were a
"documented" limitation I'd expect to see a big red warning there to
*not* do it that way.  The only way that you actually get usable
tsvector stats at the moment is to explicitly store the tsvector as an
ordinary column, as in the second approach offered in the above
documentation section.
        regards, tom lane
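
The punt condition described in this message can be modeled in a few lines of standalone C. This is only a sketch under stated assumptions: `IndexColumnModel`, `old_analyze_skips`, and the `MODEL_*` type ids are hypothetical stand-ins invented for illustration, not PostgreSQL structures or real pg_type OIDs. It shows why both tsvector opclasses in the table fall into the skipped case, since their stored type differs from the expression's type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type ids for illustration only; these are not pg_type OIDs. */
enum ModelType { MODEL_TEXT = 1, MODEL_TSVECTOR, MODEL_GTSVECTOR };

/* Hypothetical model of a single expression-index column. */
typedef struct IndexColumnModel
{
    enum ModelType expr_type;   /* type produced by the index expression */
    enum ModelType stored_type; /* type recorded for the index column */
} IndexColumnModel;

/*
 * Model of the behavior described above: ANALYZE skips the column whenever
 * the expression's type differs from the type recorded for the index column.
 */
static bool
old_analyze_skips(const IndexColumnModel *col)
{
    assert(col != NULL);
    return col->expr_type != col->stored_type;
}
```

In this model, an expression index whose opclass stores the expression's own type (text in, text stored) is analyzed, while a GiST index (tsvector in, gtsvector stored) or a GIN index (tsvector in, text stored) on a tsvector expression is skipped.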

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Robert Haas

Date:

01 August 2010, 12:54:53

On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now, of the above the only cases where we'd be likely to be able to do
> anything very useful with stats on the expression value are the name
> case, which isn't that exciting in practice, and the tsvector cases.
> For tsvector it was only with 8.4 that we had non-toy stats code, so
> while the limitation is ancient it's only recently that it started to be
> meaningful.
>
> I don't think this can be claimed to be a corner case.  If you set up
> an FTS index according to the first alternative offered in
>
> http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
>
> you will find that the system fails to collect stats for it and so you
> get stupid default estimates for your FTS queries.  If this were a
> "documented" limitation I'd expect to see a big red warning there to
> *not* do it that way.  The only way that you actually get usable
> tsvector stats at the moment is to explicitly store the tsvector as an
> ordinary column, as in the second approach offered in the above
> documentation section.

Yeah, maybe you're right.  But I'd still prefer to see us break the
ABI and do this just in 9.0 rather than changing 8.4.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Tom Lane

Date:

01 August 2010, 15:54:42

Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think this can be claimed to be a corner case.  If you set up
>> an FTS index according to the first alternative offered in
>> 
>> http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
>> 
>> you will find that the system fails to collect stats for it and so you
>> get stupid default estimates for your FTS queries.

> Yeah, maybe you're right.  But I'd still prefer to see us break the
> ABI and do this just in 9.0 rather than changing 8.4.

OK, I can live with that.  I'll take a look at it shortly.
        regards, tom lane

Re: ANALYZE versus expression indexes with nondefault opckeytype

From

Tom Lane

Date:

01 August 2010, 20:03:19

I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah, maybe you're right.  But I'd still prefer to see us break the
>> ABI and do this just in 9.0 rather than changing 8.4.

> OK, I can live with that.  I'll take a look at it shortly.

Proposed patch attached (compiles, untested as yet).

            regards, tom lane

Index: src/backend/commands/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/analyze.c,v
retrieving revision 1.152
diff -c -r1.152 analyze.c
*** src/backend/commands/analyze.c    26 Feb 2010 02:00:37 -0000    1.152
--- src/backend/commands/analyze.c    1 Aug 2010 19:56:12 -0000
***************
*** 92,98 ****
                      AnlIndexData *indexdata, int nindexes,
                      HeapTuple *rows, int numrows,
                      MemoryContext col_context);
! static VacAttrStats *examine_attribute(Relation onerel, int attnum);
  static int acquire_sample_rows(Relation onerel, HeapTuple *rows,
                      int targrows, double *totalrows, double *totaldeadrows);
  static double random_fract(void);
--- 92,99 ----
                      AnlIndexData *indexdata, int nindexes,
                      HeapTuple *rows, int numrows,
                      MemoryContext col_context);
! static VacAttrStats *examine_attribute(Relation onerel, int attnum,
!                                        Node *index_expr);
  static int acquire_sample_rows(Relation onerel, HeapTuple *rows,
                      int targrows, double *totalrows, double *totaldeadrows);
  static double random_fract(void);
***************
*** 339,345 ****
                          (errcode(ERRCODE_UNDEFINED_COLUMN),
                      errmsg("column \"%s\" of relation \"%s\" does not exist",
                             col, RelationGetRelationName(onerel))));
!             vacattrstats[tcnt] = examine_attribute(onerel, i);
              if (vacattrstats[tcnt] != NULL)
                  tcnt++;
          }
--- 340,346 ----
                          (errcode(ERRCODE_UNDEFINED_COLUMN),
                      errmsg("column \"%s\" of relation \"%s\" does not exist",
                             col, RelationGetRelationName(onerel))));
!             vacattrstats[tcnt] = examine_attribute(onerel, i, NULL);
              if (vacattrstats[tcnt] != NULL)
                  tcnt++;
          }
***************
*** 353,359 ****
          tcnt = 0;
          for (i = 1; i <= attr_cnt; i++)
          {
!             vacattrstats[tcnt] = examine_attribute(onerel, i);
              if (vacattrstats[tcnt] != NULL)
                  tcnt++;
          }
--- 354,360 ----
          tcnt = 0;
          for (i = 1; i <= attr_cnt; i++)
          {
!             vacattrstats[tcnt] = examine_attribute(onerel, i, NULL);
              if (vacattrstats[tcnt] != NULL)
                  tcnt++;
          }
***************
*** 407,427 ****
                              elog(ERROR, "too few entries in indexprs list");
                          indexkey = (Node *) lfirst(indexpr_item);
                          indexpr_item = lnext(indexpr_item);
-
-                         /*
-                          * Can't analyze if the opclass uses a storage type
-                          * different from the expression result type. We'd get
-                          * confused because the type shown in pg_attribute for
-                          * the index column doesn't match what we are getting
-                          * from the expression. Perhaps this can be fixed
-                          * someday, but for now, punt.
-                          */
-                         if (exprType(indexkey) !=
-                             Irel[ind]->rd_att->attrs[i]->atttypid)
-                             continue;
-
                          thisdata->vacattrstats[tcnt] =
!                             examine_attribute(Irel[ind], i + 1);
                          if (thisdata->vacattrstats[tcnt] != NULL)
                          {
                              tcnt++;
--- 408,415 ----
                              elog(ERROR, "too few entries in indexprs list");
                          indexkey = (Node *) lfirst(indexpr_item);
                          indexpr_item = lnext(indexpr_item);
                          thisdata->vacattrstats[tcnt] =
!                             examine_attribute(Irel[ind], i + 1, indexkey);
                          if (thisdata->vacattrstats[tcnt] != NULL)
                          {
                              tcnt++;
***************
*** 802,810 ****
   *
   * Determine whether the column is analyzable; if so, create and initialize
   * a VacAttrStats struct for it.  If not, return NULL.
   */
  static VacAttrStats *
! examine_attribute(Relation onerel, int attnum)
  {
      Form_pg_attribute attr = onerel->rd_att->attrs[attnum - 1];
      HeapTuple    typtuple;
--- 790,801 ----
   *
   * Determine whether the column is analyzable; if so, create and initialize
   * a VacAttrStats struct for it.  If not, return NULL.
+  *
+  * If index_expr isn't NULL, then we're trying to analyze an expression index,
+  * and index_expr is the expression tree representing the column's data.
   */
  static VacAttrStats *
! examine_attribute(Relation onerel, int attnum, Node *index_expr)
  {
      Form_pg_attribute attr = onerel->rd_att->attrs[attnum - 1];
      HeapTuple    typtuple;
***************
*** 827,835 ****
      stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
      stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE);
      memcpy(stats->attr, attr, ATTRIBUTE_FIXED_PART_SIZE);
!     typtuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(attr->atttypid));
      if (!HeapTupleIsValid(typtuple))
!         elog(ERROR, "cache lookup failed for type %u", attr->atttypid);
      stats->attrtype = (Form_pg_type) palloc(sizeof(FormData_pg_type));
      memcpy(stats->attrtype, GETSTRUCT(typtuple), sizeof(FormData_pg_type));
      ReleaseSysCache(typtuple);
--- 818,847 ----
      stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
      stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE);
      memcpy(stats->attr, attr, ATTRIBUTE_FIXED_PART_SIZE);
!
!     /*
!      * When analyzing an expression index, believe the expression tree's type
!      * not the column datatype --- the latter might be the opckeytype storage
!      * type of the opclass, which is not interesting for our purposes.  (Note:
!      * if we did anything with non-expression index columns, we'd need to
!      * figure out where to get the correct type info from, but for now that's
!      * not a problem.)  It's not clear whether anyone will care about the
!      * typmod, but we store that too just in case.
!      */
!     if (index_expr)
!     {
!         stats->attrtypid = exprType(index_expr);
!         stats->attrtypmod = exprTypmod(index_expr);
!     }
!     else
!     {
!         stats->attrtypid = attr->atttypid;
!         stats->attrtypmod = attr->atttypmod;
!     }
!
!     typtuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(stats->attrtypid));
      if (!HeapTupleIsValid(typtuple))
!         elog(ERROR, "cache lookup failed for type %u", stats->attrtypid);
      stats->attrtype = (Form_pg_type) palloc(sizeof(FormData_pg_type));
      memcpy(stats->attrtype, GETSTRUCT(typtuple), sizeof(FormData_pg_type));
      ReleaseSysCache(typtuple);
***************
*** 838,849 ****

      /*
       * The fields describing the stats->stavalues[n] element types default to
!      * the type of the field being analyzed, but the type-specific typanalyze
       * function can change them if it wants to store something else.
       */
      for (i = 0; i < STATISTIC_NUM_SLOTS; i++)
      {
!         stats->statypid[i] = stats->attr->atttypid;
          stats->statyplen[i] = stats->attrtype->typlen;
          stats->statypbyval[i] = stats->attrtype->typbyval;
          stats->statypalign[i] = stats->attrtype->typalign;
--- 850,861 ----

      /*
       * The fields describing the stats->stavalues[n] element types default to
!      * the type of the data being analyzed, but the type-specific typanalyze
       * function can change them if it wants to store something else.
       */
      for (i = 0; i < STATISTIC_NUM_SLOTS; i++)
      {
!         stats->statypid[i] = stats->attrtypid;
          stats->statyplen[i] = stats->attrtype->typlen;
          stats->statypbyval[i] = stats->attrtype->typbyval;
          stats->statypalign[i] = stats->attrtype->typalign;
***************
*** 1780,1786 ****
          attr->attstattarget = default_statistics_target;

      /* Look for default "<" and "=" operators for column's type */
!     get_sort_group_operators(attr->atttypid,
                               false, false, false,
                               &ltopr, &eqopr, NULL);

--- 1792,1798 ----
          attr->attstattarget = default_statistics_target;

      /* Look for default "<" and "=" operators for column's type */
!     get_sort_group_operators(stats->attrtypid,
                               false, false, false,
                               &ltopr, &eqopr, NULL);

***************
*** 1860,1869 ****
      int            nonnull_cnt = 0;
      int            toowide_cnt = 0;
      double        total_width = 0;
!     bool        is_varlena = (!stats->attr->attbyval &&
!                               stats->attr->attlen == -1);
!     bool        is_varwidth = (!stats->attr->attbyval &&
!                                stats->attr->attlen < 0);
      FmgrInfo    f_cmpeq;
      typedef struct
      {
--- 1872,1881 ----
      int            nonnull_cnt = 0;
      int            toowide_cnt = 0;
      double        total_width = 0;
!     bool        is_varlena = (!stats->attrtype->typbyval &&
!                               stats->attrtype->typlen == -1);
!     bool        is_varwidth = (!stats->attrtype->typbyval &&
!                                stats->attrtype->typlen < 0);
      FmgrInfo    f_cmpeq;
      typedef struct
      {
***************
*** 2126,2133 ****
              for (i = 0; i < num_mcv; i++)
              {
                  mcv_values[i] = datumCopy(track[i].value,
!                                           stats->attr->attbyval,
!                                           stats->attr->attlen);
                  mcv_freqs[i] = (double) track[i].count / (double) samplerows;
              }
              MemoryContextSwitchTo(old_context);
--- 2138,2145 ----
              for (i = 0; i < num_mcv; i++)
              {
                  mcv_values[i] = datumCopy(track[i].value,
!                                           stats->attrtype->typbyval,
!                                           stats->attrtype->typlen);
                  mcv_freqs[i] = (double) track[i].count / (double) samplerows;
              }
              MemoryContextSwitchTo(old_context);
***************
*** 2184,2193 ****
      int            nonnull_cnt = 0;
      int            toowide_cnt = 0;
      double        total_width = 0;
!     bool        is_varlena = (!stats->attr->attbyval &&
!                               stats->attr->attlen == -1);
!     bool        is_varwidth = (!stats->attr->attbyval &&
!                                stats->attr->attlen < 0);
      double        corr_xysum;
      Oid            cmpFn;
      int            cmpFlags;
--- 2196,2205 ----
      int            nonnull_cnt = 0;
      int            toowide_cnt = 0;
      double        total_width = 0;
!     bool        is_varlena = (!stats->attrtype->typbyval &&
!                               stats->attrtype->typlen == -1);
!     bool        is_varwidth = (!stats->attrtype->typbyval &&
!                                stats->attrtype->typlen < 0);
      double        corr_xysum;
      Oid            cmpFn;
      int            cmpFlags;
***************
*** 2476,2483 ****
              for (i = 0; i < num_mcv; i++)
              {
                  mcv_values[i] = datumCopy(values[track[i].first].value,
!                                           stats->attr->attbyval,
!                                           stats->attr->attlen);
                  mcv_freqs[i] = (double) track[i].count / (double) samplerows;
              }
              MemoryContextSwitchTo(old_context);
--- 2488,2495 ----
              for (i = 0; i < num_mcv; i++)
              {
                  mcv_values[i] = datumCopy(values[track[i].first].value,
!                                           stats->attrtype->typbyval,
!                                           stats->attrtype->typlen);
                  mcv_freqs[i] = (double) track[i].count / (double) samplerows;
              }
              MemoryContextSwitchTo(old_context);
***************
*** 2583,2590 ****
              for (i = 0; i < num_hist; i++)
              {
                  hist_values[i] = datumCopy(values[pos].value,
!                                            stats->attr->attbyval,
!                                            stats->attr->attlen);
                  pos += delta;
                  posfrac += deltafrac;
                  if (posfrac >= (num_hist - 1))
--- 2595,2602 ----
              for (i = 0; i < num_hist; i++)
              {
                  hist_values[i] = datumCopy(values[pos].value,
!                                            stats->attrtype->typbyval,
!                                            stats->attrtype->typlen);
                  pos += delta;
                  posfrac += deltafrac;
                  if (posfrac >= (num_hist - 1))
Index: src/include/commands/vacuum.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/commands/vacuum.h,v
retrieving revision 1.89
diff -c -r1.89 vacuum.h
*** src/include/commands/vacuum.h    9 Feb 2010 21:43:30 -0000    1.89
--- src/include/commands/vacuum.h    1 Aug 2010 19:56:12 -0000
***************
*** 62,70 ****
      /*
       * These fields are set up by the main ANALYZE code before invoking the
       * type-specific typanalyze function.
       */
      Form_pg_attribute attr;        /* copy of pg_attribute row for column */
!     Form_pg_type attrtype;        /* copy of pg_type row for column */
      MemoryContext anl_context;    /* where to save long-lived data */

      /*
--- 62,78 ----
      /*
       * These fields are set up by the main ANALYZE code before invoking the
       * type-specific typanalyze function.
+      *
+      * Note: do not assume that the data being analyzed has the same datatype
+      * shown in attr, ie do not trust attr->atttypid, attlen, etc.  This is
+      * because some index opclasses store a different type than the underlying
+      * column/expression.  Instead use attrtypid, attrtypmod, and attrtype for
+      * information about the datatype being fed to the typanalyze function.
       */
      Form_pg_attribute attr;        /* copy of pg_attribute row for column */
!     Oid            attrtypid;        /* type of data being analyzed */
!     int32        attrtypmod;        /* typmod of data being analyzed */
!     Form_pg_type attrtype;        /* copy of pg_type row for attrtypid */
      MemoryContext anl_context;    /* where to save long-lived data */

      /*
***************
*** 95,104 ****

      /*
       * These fields describe the stavalues[n] element types. They will be
!      * initialized to be the same as the column's that's underlying the slot,
!      * but a custom typanalyze function might want to store an array of
!      * something other than the analyzed column's elements. It should then
!      * overwrite these fields.
       */
      Oid            statypid[STATISTIC_NUM_SLOTS];
      int2        statyplen[STATISTIC_NUM_SLOTS];
--- 103,111 ----

      /*
       * These fields describe the stavalues[n] element types. They will be
!      * initialized to match attrtypid, but a custom typanalyze function might
!      * want to store an array of something other than the analyzed column's
!      * elements. It should then overwrite these fields.
       */
      Oid            statypid[STATISTIC_NUM_SLOTS];
      int2        statyplen[STATISTIC_NUM_SLOTS];
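
As a reading aid for the patch above, here is a reduced standalone C model of the rule it introduces: take the datatype from the index expression when there is one, and derive the by-value and length properties from that type rather than from the index column's pg_attribute entry. This is a sketch under assumptions, not the patched code. `TypeModel`, `choose_analyzed_type`, and the numeric ids are hypothetical; only the `typbyval`/`typlen` tests mirror what the patch computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced description of a datatype (stand-in for a pg_type row). */
typedef struct TypeModel
{
    int  type_id;   /* illustrative id, not a real OID */
    bool typbyval;  /* true if values are passed by value */
    int  typlen;    /* fixed length, -1 for varlena, -2 for C string */
} TypeModel;

/*
 * Model of the selection rule: when an index expression is present, use its
 * type; otherwise fall back to the column's own type.
 */
static const TypeModel *
choose_analyzed_type(const TypeModel *column_type,
                     const TypeModel *index_expr_type)
{
    assert(column_type != NULL);
    return index_expr_type != NULL ? index_expr_type : column_type;
}

/* The two width tests, computed from the chosen type. */
static bool
is_varlena(const TypeModel *t)
{
    return !t->typbyval && t->typlen == -1;
}

static bool
is_varwidth(const TypeModel *t)
{
    return !t->typbyval && t->typlen < 0;
}
```

The case that motivates reading these properties from the chosen type is an opclass such as tsquery_ops under GiST, where the index column is stored as bigint but the values handed to the analysis code are tsquery. In this model the by-value flag and length of the stored type would describe the wrong data.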