Folks,
One chronic problem users encounter is this:
1. User creates new table
2. User inserts 45 records into new table.
3. Time passes.
4. User creates a query which joins against new table.
5. Planner uses estimate of 1000 rows for the new table.
6. User gets very bad query plan.
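For anyone who wants to reproduce steps 1-5, a minimal sketch (the table
and values here are made up, and the exact default estimate varies with
version and row width):

create table brand_new ( val int );
insert into brand_new select i from generate_series(1,45) as gs(i);
-- no stats yet, so the planner guesses from a default page count
-- rather than counting the actual 45 rows:
explain select * from brand_new;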
Now, I look at this and ask myself: why didn't autoanalyze kick in at
step 3? After all, this was a table that had 0 rows, and we inserted 45
rows, making the table infinitely larger. It should have gotten onto the
autoanalyze list, no?
Well, no. It seems that any table that accumulates fewer than
autovacuum_analyze_threshold changed rows will NEVER be autoanalyzed.
Ever.
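The trigger condition from the docs is:

    tuples changed > autovacuum_analyze_threshold
                     + autovacuum_analyze_scale_factor * reltuples

For a freshly created table, reltuples is 0, so only the flat threshold
matters, and with stock settings that's 50 rows (check yours with SHOW):

show autovacuum_analyze_threshold;    -- 50 by default
show autovacuum_analyze_scale_factor; -- 0.1; irrelevant while reltuples = 0

To demonstrate: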
postgres=# create table thirty_rows ( val int );
CREATE TABLE
postgres=# insert into thirty_rows select i from generate_series(1,30)
as gs(i);
INSERT 0 30
postgres=# create table onetwenty_rows ( val int );
CREATE TABLE
postgres=# insert into onetwenty_rows select i from
generate_series(1,120) as gs(i);
INSERT 0 120
postgres=# create table twocent_rows ( val int );
CREATE TABLE
postgres=# insert into twocent_rows select i from generate_series(1,200)
as gs(i);
INSERT 0 200
... wait 5 min ...
postgres=# select relname, last_autoanalyze from pg_stat_user_tables
where relname like '%_rows';
    relname     |        last_autoanalyze
----------------+--------------------------------
 thirty_rows    |
 twocent_rows   | 2012-10-12 16:46:45.025647-07
 onetwenty_rows | 2012-10-12 16:46:45.014084-07
(3 rows)
postgres=# select * from pg_stats where tablename = 'thirty_rows';
 schemaname | tablename | attname | inherited | null_frac | avg_width |
 n_distinct | most_common_vals | most_common_freqs | histogram_bounds |
 correlation | most_common_elems | most_common_elem_freqs |
 elem_count_histogram
(0 rows)
This seems easy to fix. If a table has no stats but has any write
activity at all, it should automatically go on the autoanalyze list.
Or, if it's easier to implement, any table whose last_autoanalyze is
null.
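In the meantime, a workaround sketch one could run from cron
(hypothetical query; it just generates ANALYZE statements for
never-analyzed tables that have seen any writes):

select 'analyze ' || quote_ident(schemaname) || '.'
       || quote_ident(relname) || ';'
from pg_stat_user_tables
where last_analyze is null
  and last_autoanalyze is null
  and n_tup_ins + n_tup_upd + n_tup_del > 0;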
Objections/complications/alternatives?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com