Re: Postgres optimizer choosing wrong index - Mailing list pgsql-general

From Jack Orenstein
Subject Re: Postgres optimizer choosing wrong index
Date
Msg-id 4902123E.30009@hds.com
Whole thread Raw
In response to Re: Postgres optimizer choosing wrong index  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Postgres optimizer choosing wrong index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane wrote:
> Jack Orenstein <jack.orenstein@hds.com> writes:
>> I may have simplified too far. Our application runs a number of
>> different queries. All our WHERE clauses restrict dh and fh. For a
>> given pair of (dh, fh) values, the initial query should come up empty
>> and then insert this pair, and then there is further processing
>> (SELECT, UPDATE). Something is causing a huge number of index row
>> reads (according to pg_stat_user_indexes) but only in tables that have
>> been vacuumed.
>
> Well, that's a bit more concrete but it's still difficult to tell where
> the problem is.  Are you by any chance creating new tables and then
> vacuuming them while they're still empty?  That would cause
> pg_class.relpages to get set to zero, and 7.4.x is not bright enough to
> change plans when you later fill the table (until you do another vacuum
> or analyze).  However, I think 7.4 would always choose a seqscan on a
> table it thinks is zero-size, so I'm not sure that that's what's going
> on here.

I've collected much more detailed information on this problem.

(Recap: In 7.4, table T(int dh, int fh, int nm, ...) has indexes idx_df on (dh,
fh), and idx_dn on (dh, nm). Queries of the form "... where dh = ? and fh = ?"
appear to be using the (dh, nm) index leading to horrendous performance in cases
where there are very few unique dh values, and T has been vacuumed.)

The optimizer is definitely picking the wrong index even when the query
restricts dh to a value actually present in the table. (Yesterday, I was misled
by running EXPLAIN with a value not in the table.) Tom's speculation above is
mostly correct except that, as shown below, the optimizer is not choosing a seqscan.

EXPERIMENT:

- I created two schemas, NOVAC and VAC, each with a table T as described above.

- Before loading data, I ran VACUUM ANALYZE on VAC.T.

- I then started loading data. The workload is a mixture of INSERT,   SELECT and
UPDATE. For SELECT and UPDATE the WHERE clause always includes "dh = ? and fh = ?".

PG_STAT RESULTS:

Stats collection was enabled. Here is sample output during the run, (yes, those
are real numbers, I just happened to run the query when the 2nd row counters hit
those nice round numbers).

     ris=# select schemaname, indexrelname, idx_scan, idx_tup_read,
idx_tup_fetch from pg_stat_user_indexes where schemaname in ('novac', 'vac') and
relname = 'external_file' order by 1, 2;
      schemaname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
     ------------+--------------+----------+--------------+---------------
      novac      | idx_dn       |        0 |            0 |             0
      novac      | idx_dh       |    25000 |        20000 |         20000
      vac        | idx_dn       |    16093 |     25991728 |      25991728
      vac        | idx_dh       |        0 |            0 |             0

This shows the different index usage in the two schemas. The schema containing
the vacuumed table used idx_dn exclusively; in the other schema, idx_dh was used
exclusively.

EXPLAIN PLAN RESULTS:

This query shows the distribution of T.dh values in NOVAC:

     ris=# select dh, count(*) from novac.t group by dh;
      dh        | count
     -----------+-------
      280433162 |  5000
      601454890 |  5000

Using one of these dh values, the optimizer selects idx_df as I expected (the fh
values are nearly unique; 0 is the median of the possible values):

     ris=# explain select * from novac.t where dh = 280433162 and fh = 0;
                                         QUERY PLAN

     ------------------------------------------------------------------
      Index Scan using idx_df on t  (cost=0.00..4.83 rows=1 width=454)
        Index Cond: ((dh = 280433162) AND (fh = 0))
     (2 rows)

But in the VAC schema:

     ris=# select dir_hash, count(*) from vac.t group by dh;
      dir_hash  | count
     -----------+-------
      758082190 |  5000
      980351022 |  5000
     (2 rows)

     ris=# explain select * from vac.t where dh = 758082190 and fh = 0;
                                                 QUERY PLAN

     ------------------------------------------------------------------
      Index Scan using idx_dn on t  (cost=0.00..3.68 rows=1 width=454)
        Index Cond: (dh = 758082190)
        Filter: (fh = 0)
     (3 rows)

Here are page and tuple counts:

     ris=# select n.nspname, relname, relpages, reltuples from pg_class,
pg_namespace n where relname = 't' and relnamespace = n.oid;
      nspname  | relname | relpages | reltuples
     ----------+---------+----------+-----------
      novac    | t       |       10 |      1000
      vac      | t       |        0 |         0

 From Tom Lane's earlier email, the vacuum prior to the load explains the 0s in
the second row.

pg_stats has no data for table T in either schema. In the case of the VAC
schema, I'm guessing this is because the table was actually empty when analyzed.


CONCLUSION:

The optimizer is objectively making the wrong choice. The pg_stat results
included above, (as well as the block-level counts), show that the idx_df index
should have been used. While the dh part of the index is not highly selective,
the fh part is. And with very high ratios of fh to dh values, (as in my
experiment), it matters a lot. The transaction rates measured by the test
program make this clear even without looking at pg_stat numbers.

Admittedly, the optimizer doesn't have much to go on. But I would think that
with a WHERE clause of the form "dh = ? and fh = ?", choosing the (dh, fh) index
over any other index would be a reasonable heuristic.


Jack


pgsql-general by date:

Previous
From: Scott Ribe
Date:
Subject: Re: max time in a table query takes ages
Next
From: "Grzegorz Jaśkiewicz"
Date:
Subject: Re: max time in a table query takes ages