Re: Postgres optimizer choosing wrong index - Mailing list pgsql-general

From Jack Orenstein
Subject Re: Postgres optimizer choosing wrong index
Date
Msg-id 49062D08.1020309@hds.com
Whole thread Raw
In response to Re: Postgres optimizer choosing wrong index  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Postgres optimizer choosing wrong index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane wrote:
> Jack Orenstein <jack.orenstein@hds.com> writes:
>> - I created two schemas, NOVAC and VAC, each with a table T as described above.
>
>> - Before loading data, I ran VACUUM ANALYZE on VAC.T.
>
>> - I then started loading data. The workload is a mixture of INSERT,   SELECT and
>> UPDATE. For SELECT and UPDATE the WHERE clause always includes "dh = ? and fh = ?".
>
> Basically your problem here is that vacuum records the size of the table
> as zero (in pg_class.relpages/reltuples) and that causes the computed
> costs of the two indexscans to be exactly the same, so it's a tossup
> which one gets used.  (In recent versions I think the index with higher
> OID would typically get chosen in a tie, but I forget if 7.4 worked that
> way.)
>
> 8.0 and up are smart enough not to believe pg_class.relpages anymore
> after you've loaded a lot of data, but 7.4 isn't.  In testing similar
> cases here, I get reasonable cost estimates and a sane plan choice
> from 7.4 so long as the stats are up to date.
>
> Bottom line: you need to vacuum (or preferably analyze) *after*
> initially populating a table, not before.

OK, I've added this behavior to my application. As the table is being loaded, I
run VACUUM ANALYZE every 500 inserts, until we get to size 10,000. I know this
is working because of application-level logging, and because I see relpages and
reltuples go up.

EXPLAIN says that the correct index is being used -- it didn't used to. However,
pg_stat* says otherwise. In my test, I have exactly one dh value. Running
EXPLAIN with this value produces a plan using idx_dh (the correct index), but
pg_stats says that idx_dn is being used (see psql session below).

This eventually works itself out. Eventually, the pg_stats for idx_dh start
going up, showing that that index is eventually being used. But this discrepancy
between EXPLAIN and actual query execution is making life very difficult.

Is the discrepancy between EXPLAIN and pg_stats due to some sort of caching per
connection? E.g., a connection that uses one plan for a query is stuck with that
plan for that query?

What would really be nice is a logging option that reported the execution plan
actually used for a query.

Jack



ris=# \d t;
                Table "vac.t"
     Column    |          Type          | Modifiers
--------------+------------------------+-----------
  dh           | integer                | not null
  fh           | integer                | not null
  nm           | bigint                 |
...
Indexes:
     "idx_dn" btree (dh, nm)
     "idx_dh" btree (dh, fh)

ris=# select dh, count(*) from t group by dh;
  dh        | count
-----------+-------
  589849733 | 19890
(1 row)

ris=# explain select * from t where dh = 589849733 and fh = 0;
                                     QUERY PLAN
------------------------------------------------------------------
  Index Scan using idx_dh on t  (cost=0.00..5.26 rows=2 width=570)
    Index Cond: ((dh = 589849733) AND (fh = 0))
(2 rows)


ris=# select schemaname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
from pg_stat_user_indexes where schemaname = 'vac' and relname = 't' order by 1, 2;
  schemaname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
------------+--------------+----------+--------------+---------------
  vac        | idx_dn       |    31315 |    122773990 |     122773990
  vac        | idx_dh       |        0 |            0 |             0



pgsql-general by date:

Previous
From: Chris Browne
Date:
Subject: Re: Execute Shell script after insert
Next
From: Bill Thoen
Date:
Subject: How Do I Find the Date When A Table Was Last Changed?