Thread: 7.4 - basic tuning question

From: Simon Waters
Hi,

We have a log table on one server with 1.9 million records.

One column, "event" (type text), is a string that currently takes a
small number of distinct values (~43) (hmm, that could have been
normalised better).

We noticed when querying for events of a specific type that the queries
were slower than expected. The planner simply wasn't using the index
(btree, default settings) on this column on this server (the test
server, with fewer records, was fine).

Using "ALTER TABLE ... SET STATISTICS" to increase the number of buckets
to 50 resolved the issue; we went pretty much straight there on
discovering there are no "HINTS".

However, we aren't quite sure why this case was pathological, and I
don't quite grok the documentation.

I assume that the histogram_bounds for strings are in alphabetical
order, so that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even in
the worst case of including both of these common values, the planner
ought to have assumed that less than 10% of records were covered by the
selected value, so it seems unlikely to me that skipping the index would
be a good idea.

What am I missing? (And yes, there is a plan to upgrade!)


=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...lots of time passes...)
 count
-------
  1432
(1 row)


=> SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;

  count |   event
--------+-----------
      6 | DNRFAIL
     14 | ADMDNR
     14 | UPGRADE
     18 | FOCRENEW
     21 | AUTOCN
     25 | ADMCC
     27 | TEMPIN
     31 | DNRCANCEL
     43 | EXPIRED
    128 | DIRECTBUY
    130 | CANCEL
    130 | CANCELQ
    154 | FOCBUY
    173 | EXPCCWARN
    179 | OFFER
    209 | DNROK
    214 | TEMPRE
    356 | CCWARN
    429 | ADMLOGIN
    719 | SUBSCRIBE
    787 | CCSUCCESS
    988 | CCFAILURE
   1217 | TEMPNEW
   1298 | PAYPAL
   1431 | DEMOSTART
   1776 | CCREQUEST
   2474 | ACCTUPD
  15169 | SYSMAINT
  42251 | IDEMAIL
  46964 | DELETE
  50764 | RELOGIN
  57022 | NEWUSR
  64907 | PUBREC0
  65449 | UNPUBLISH
  92843 | LOGOUT
  99018 | KILLSESS
 128900 | UPLOAD
 134994 | LOGIN
 137608 | NEWPAGE
 447556 | PUBREC1
 489572 | PUBLISH


=> EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
   Filter: (event = 'DEMOSTART'::text)
(2 rows)


=> ALTER TABLE log ALTER COLUMN event SET STATISTICS 50; ANALYSE log(event);
ALTER TABLE
ANALYZE


=> EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
   ->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310 width=0)
         Index Cond: (event = 'DEMOSTART'::text)
(3 rows)


=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...almost no time passes...)
 count
-------
  1432
(1 row)


BEFORE
pajax=> select * from pg_stats where tablename = 'log' and attname='event';
-[ RECORD 1 ]-----+---------------------------------------------------------------------------------------------
schemaname        | public
tablename         | log
attname           | event
null_frac         | 0
avg_width         | 10
n_distinct        | 25
most_common_vals  | {PUBLISH,PUBREC1,NEWPAGE,UPLOAD,LOGIN,KILLSESS,LOGOUT}
most_common_freqs | {0.257333,0.248333,0.072,0.0696667,0.0613333,0.0543333,0.0506667}
histogram_bounds  | {ACCTUPD,DELETE,IDEMAIL,NEWUSR,NEWUSR,PUBREC0,PUBREC0,RELOGIN,SYSMAINT,UNPUBLISH,UNPUBLISH}
correlation       | 0.120881

AFTER
pajax=> select * from pg_stats where tablename='log' and attname='event';
-[ RECORD 1 ]-----+-------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | log
attname           | event
null_frac         | 0
avg_width         | 10
n_distinct        | 32
most_common_vals  | {PUBLISH,PUBREC1,NEWPAGE,LOGIN,UPLOAD,KILLSESS,LOGOUT,PUBREC0,UNPUBLISH,NEWUSR,RELOGIN,DELETE,IDEMAIL}
most_common_freqs | {0.249067,0.248533,0.0761333,0.0719333,0.0685333,0.0526,0.045,0.0368,0.0348667,0.029,0.0255333,0.0254667,0.0238667}
histogram_bounds  | {ACCTUPD,ACCTUPD,ACCTUPD,ADMLOGIN,CCREQUEST,CCSUCCESS,DEMOSTART,FOCBUY,PAYPAL,SYSMAINT,SYSMAINT,SYSMAINT,SYSMAINT,SYSMAINT,SYSMAINT,SYSMAINT,SYSMAINT,TEMPNEW,TEMPRE}
correlation       | 0.106671


Re: 7.4 - basic tuning question

From: Richard Huxton
Simon Waters wrote:

With performance questions on 7.4, the best advice is "upgrade at your
earliest convenience" - you're missing a *lot* of improvements. You say
you're planning to anyway, and I'd recommend putting effort into the
upgrade rather than wasting it on tuning a system you're leaving behind.

> I assume that the histogram_bounds for strings are in alphabetical
> order, so that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even in
> the worst case of including both of these common values, the planner
> ought to have assumed that less than 10% of records were covered by the
> selected value, so it seems unlikely to me that skipping the index would
> be a good idea.

Well, the real question is how many blocks need to be read to find those
DEMOSTART rows. At some point, around 5-10% of the table, it's easier
just to read the whole table than to go back and forth between the index
and the table. The precise point will depend on how much RAM you have,
disk speed, etc.
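
That trade-off can be sketched with a toy model. This is NOT PostgreSQL's
actual cost model (the planner also weighs CPU costs, index correlation
and caching), and the rows-per-page and page-cost figures below are
invented for illustration:

```python
# Crude comparison of sequential scan vs index scan cost. Invented
# parameters; random_page_cost/seq_page_cost uses the classic 4:1 ratio.
def cheaper_plan(total_rows, matching_rows, rows_per_page=100,
                 seq_page_cost=1.0, random_page_cost=4.0):
    pages = total_rows / rows_per_page
    seq_cost = pages * seq_page_cost      # read every page once, in order
    # Pessimistic index scan: one random heap-page fetch per matching row.
    index_cost = matching_rows * random_page_cost
    return "index" if index_cost < seq_cost else "seqscan"

# With the true count (1432 DEMOSTART rows out of ~1.9M) the index wins;
# with the planner's 20436-row guess, the sequential scan looks cheaper.
print(cheaper_plan(1_900_000, 1432))     # index
print(cheaper_plan(1_900_000, 20436))    # seqscan
```

Even in this crude model, the overestimate alone is enough to flip the
decision, which is exactly what the EXPLAIN output showed.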

> => SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
> (...lots of time passes...)
>  count
> -------
>   1432
> (1 row)

OK, not many. The crucial bit is below, though. These are the values it
holds stats on, and all it knows is that DEMOSTART has fewer than 57000
entries. OK, it's more complicated than that, but basically there are
values it tracks and then everything else, so it assumes that all other
values have the same chance of occurring.

> => SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;
>
>   count |   event
> --------+-----------
[snip]
>   57022 | NEWUSR
>   64907 | PUBREC0
>   65449 | UNPUBLISH
>   92843 | LOGOUT
>   99018 | KILLSESS
>  128900 | UPLOAD
>  134994 | LOGIN
>  137608 | NEWPAGE
>  447556 | PUBREC1
>  489572 | PUBLISH

Which is why it guesses 20436 rows below. If you'd done "SET
enable_seqscan = off" and then run the EXPLAIN again, it should have
estimated a cost for the index scan of more than 54317.14.
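
That guess can be reproduced from the BEFORE pg_stats row: for a value
not in most_common_vals, the planner spreads the leftover frequency
evenly over the remaining distinct values. (The reltuples figure below
is an assumption, back-of-envelope from the reported ~1.9 million rows.)

```python
# Planner's row estimate for a value NOT in most_common_vals:
# leftover frequency spread evenly across the remaining distinct values.
# Frequencies are from the BEFORE pg_stats row; reltuples is approximate.
most_common_freqs = [0.257333, 0.248333, 0.072, 0.0696667,
                     0.0613333, 0.0543333, 0.0506667]
n_distinct = 25
reltuples = 1_974_000  # assumed planner row count for the ~1.9M-row table

leftover = 1.0 - sum(most_common_freqs)
selectivity = leftover / (n_distinct - len(most_common_freqs))
print(round(selectivity * reltuples))  # ~20435, essentially EXPLAIN's 20436
```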

> => EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
>                          QUERY PLAN
> ------------------------------------------------------------
>  Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
>    Filter: (event = 'DEMOSTART'::text)
> (2 rows)
>
>
> => ALTER TABLE log ALTER COLUMN event SET STATISTICS 50; ANALYSE log(event);
> ALTER TABLE
> ANALYZE
>
>
> => EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
>                                     QUERY PLAN
> -----------------------------------------------------------------------------------
>  Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
>    ->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310 width=0)
>          Index Cond: (event = 'DEMOSTART'::text)
> (3 rows)

Not bad - now it knows roughly how many rows it will find, and it sees
that the index is cheaper. It's not completely accurate - it uses
statistical sampling (and of course the stats go out of date as soon as
you update the table).
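
The new estimate checks out against the AFTER pg_stats the same way:
DEMOSTART still isn't in most_common_vals, but with 13 tracked values
and n_distinct at 32, the leftover frequency is tiny. (reltuples below
is again an assumed approximation of the planner's row count.)

```python
# Row estimate for a non-MCV value with the AFTER stats (target 50).
# Frequencies are from the AFTER pg_stats row; reltuples is approximate.
most_common_freqs = [0.249067, 0.248533, 0.0761333, 0.0719333, 0.0685333,
                     0.0526, 0.045, 0.0368, 0.0348667, 0.029,
                     0.0255333, 0.0254667, 0.0238667]
n_distinct = 32
reltuples = 1_965_000  # assumed

leftover = 1.0 - sum(most_common_freqs)
selectivity = leftover / (n_distinct - len(most_common_freqs))
print(round(selectivity * reltuples))  # ~1310, matching the new EXPLAIN
```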

HTH

--
  Richard Huxton
  Archonet Ltd