Thread: BUG #4812: statistics not available on tables containing only hstore column

The following bug has been logged online:

Bug reference:      4812
Logged by:          George Su
Email address:      george.su@efrontier.com
PostgreSQL version: 8.3.6
Operating system:   SUSE Linux
Description:        statistics not available on tables containing only
hstore column
Details:

If I create a table containing only hstore columns, then no statistics on
row count and page count is available even after analyzing.

safire:9945:postgres# create table tmp_tbl (a hstore);
CREATE TABLE
safire:9945:postgres# insert into tmp_tbl select i::text => i::text from
generate_series(1, 100000) as i;
INSERT 0 100000
safire:9945:postgres# analyze tmp_tbl;
ANALYZE
safire:9945:postgres# select count(*) from tmp_tbl;
 count
--------
 100000
(1 row)

safire:9945:postgres# select relname, relpages, reltuples from pg_class
where relname = 'tmp_tbl';
 relname | relpages | reltuples
---------+----------+-----------
 tmp_tbl |        0 |         0
(1 row)

The statistics shows up if I add a, for example, int column.

Re: BUG #4812: statistics not available on tables containing only hstore column

From
Heikki Linnakangas
Date:
George Su wrote:
> If I create a table containing only hstore columns, then no statistics on
> row count and page count is available even after analyzing.

(As a work-around, running VACUUM on the table will update those stats.)

Hmm, we intentionally skip analyze on tables that have no analyzable
columns, but ISTM as well that relpages and reltuples in pg_class should
still be updated. We have this in analyze.c:

    /*
     * Quit if no analyzable columns
     */
    if (attr_cnt <= 0 && !analyzableindex)
    {
        /*
         * We report that the table is empty; this is just so that the
         * autovacuum code doesn't go nuts trying to get stats about a
         * zero-column table.
         */
        if (update_reltuples)
            pgstat_report_analyze(onerel, 0, 0);
        goto cleanup;
    }

ISTM we should not skip the analyze if run as a stand-alone ANALYZE. I'm
not sure what problem the author of that comment envisioned with
autovacuum, but resetting the live and dead tuple counters doesn't seem
right to me. In the worst case, autovacuum launches auto-analyze on the
table whenever there has been enough update/delete activity, and you
never reach the auto*vacuum* threshold because the dead tuple counter is
always cleared.

Does anyone see a problem with the attached patch, changing the code
above so that in a stand-alone ANALYZE (including auto-analyze), the
table is analyzed to update relpages and reltuples even if there's no
analyzable columns?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/analyze.c,v
retrieving revision 1.136
diff -c -r1.136 analyze.c
*** src/backend/commands/analyze.c    5 May 2009 18:02:11 -0000    1.136
--- src/backend/commands/analyze.c    18 May 2009 11:43:05 -0000
***************
*** 363,381 ****
      }

      /*
!      * Quit if no analyzable columns
       */
!     if (attr_cnt <= 0 && !analyzableindex)
!     {
!         /*
!          * We report that the table is empty; this is just so that the
!          * autovacuum code doesn't go nuts trying to get stats about a
!          * zero-column table.
!          */
!         if (update_reltuples)
!             pgstat_report_analyze(onerel, 0, 0);
          goto cleanup;
-     }

      /*
       * Determine how many rows we need to sample, using the worst case from
--- 363,372 ----
      }

      /*
!      * Quit if no analyzable columns and no pg_class update needed.
       */
!     if (attr_cnt <= 0 && !analyzableindex && !update_reltuples)
          goto cleanup;

      /*
       * Determine how many rows we need to sample, using the worst case from

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> ISTM we should not skip the analyze if run as a stand-alone ANALYZE. I'm
> not sure what problem the author of that comment envisioned with
> autovacuum, but resetting the live and dead tuple counters doesn't seem
> right to me.

I think that code probably originated at a time when palloc(0) threw an
error, and was meant to avoid any corner-case failures in the subsequent
logic.  I can't get very excited about the case --- it seems to me the
*real* fix for this complaint is to give hstore some ANALYZE support ---
but if you've tested that it behaves sanely with no columns I have no
objection to committing it.

            regards, tom lane

Re: BUG #4812: statistics not available on tables containing only hstore column

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> I think that code probably originated at a time when palloc(0) threw an
> error, and was meant to avoid any corner-case failures in the subsequent
> logic.  I can't get very excited about the case --- it seems to me the
> *real* fix for this complaint is to give hstore some ANALYZE support ---

Possibly. The standard type analyze function is not used for hstore
because it doesn't have an equals operator.

> but if you've tested that it behaves sanely with no columns I have no
> objection to committing it.

Committed.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com