Re: Statistics collection question - Mailing list pgsql-general

From Phoenix Kiula
Subject Re: Statistics collection question
Date
Msg-id e373d31e0709030905h59efcebbv7a1d3c0cdab12640@mail.gmail.com
Whole thread Raw
In response to Re: Statistics collection question  (Alban Hertroys <alban@magproductions.nl>)
Responses Re: Statistics collection question
List pgsql-general
On 03/09/07, Alban Hertroys <alban@magproductions.nl> wrote:
> Phoenix Kiula wrote:
> As I understand it it's a sample of how the data is distributed.
> Probably it's based on statistical mathematics that specifies a minimum
> size for a representive sample of a given data set. It boils down to:
> "If you want to know how many people like vanilla ice cream, how many
> people do you need to ask their preference?".


Thanks for this explanation. So I should set the statistics on my
indexed column to 10 or so? I made it 1000 this morning, trying to see
how it would affect performance.



> That's definitely not normal. I have a smallish table here containing
> 2.5 million records, and querying for one with a specific index takes
> 141 micro(!) seconds. The hardware involved is a dual opteron with 4G,
> in a xen domain; I don't know what disks are used, but I doubt they're
> raptors.
>
> So something is wrong with your setup, that much is obvious. I sincerely
> doubt that postgres is to blame here.
>
> You did check that you're not connecting through the internet and
> getting a DNS timeout?



I am getting these times from the postgres log (pglog). I have setup
the minimum query time as 5000 (ms). Here is an except from my
log...which is constantly updated with more and more of these!

Here's an excerpt from  the log. It looks abysmal!!


-------------------

LOG:  duration: 85865.904 ms  statement: select t_info, dstats, id
from trades where t_alias = '1q8bf' and status = 'Y'
LOG:  duration: 83859.505 ms  statement: select t_info, dstats, id
from trades where t_alias = '1a7iv' and status = 'Y'
LOG:  duration: 71922.423 ms  statement: select t_info, dstats, id
from trades where t_alias = 'bvu' and status = 'Y'
LOG:  duration: 74924.741 ms  statement: select t_info, dstats, id
from trades where t_alias = 'nt3g' and status = 'Y'
LOG:  duration: 82471.036 ms  statement: select t_info, dstats, id
from trades where t_alias = '15p8m' and status = 'Y'
LOG:  duration: 90015.410 ms  statement: select t_info, dstats, id
from trades where t_alias = 'pkfi' and status = 'Y'
LOG:  duration: 72713.815 ms  statement: select t_info, dstats, id
from trades where t_alias = 'evdi' and status = 'Y'
LOG:  duration: 88054.444 ms  statement: select t_info, dstats, id
from trades where t_alias = '1a8zj' and status = 'Y'
LOG:  duration: 94502.678 ms  statement: select t_info, dstats, id
from trades where t_alias = '1d188' and status = 'Y'
LOG:  duration: 82178.724 ms  statement: select t_info, dstats, id
from trades where t_alias = 'q8zu' and status = 'Y'
LOG:  duration: 107030.741 ms  statement: select t_info, dstats, id
from trades where t_alias = 'jnzu' and status = 'Y'
LOG:  duration: 87634.723 ms  statement: select t_info, dstats, id
from trades where t_alias = 'tav9' and status = 'Y'
LOG:  duration: 104271.695 ms  statement: select t_info, dstats, id
from trades where t_alias = '37tk7' and status = 'Y'
LOG:  duration: 88726.671 ms  statement: select t_info, dstats, id
from trades where t_alias = 'tavc' and status = 'Y'
LOG:  duration: 74710.120 ms  statement: select t_info, dstats, id
from trades where t_alias = '1q8zu' and status = 'Y'
LOG:  duration: 93100.863 ms  statement: select t_info, dstats, id
from trades where t_alias = '1ovmc' and status = 'Y'
LOG:  duration: 83659.489 ms  statement: select t_info, dstats, id
from trades where t_alias = '1p9ub' and status = 'Y'
LOG:  duration: 71963.413 ms  statement: select t_info, dstats, id
from trades where t_alias = '9awlia' and status = 'Y'
LOG:  duration: 83569.602 ms  statement: select t_info, dstats, id
from trades where t_alias = '2yeza' and status = 'Y'
LOG:  duration: 93473.282 ms  statement: select t_info, dstats, id
from trades where t_alias = '17huv' and status = 'Y'

-----------------------


By way of an explanation, the T_INFO is a text column, DSTATS is
char(1), and ID is the bigint primary key. Status can be 'Y' or 'N',
so I have not included it in the index (not selective enough) but
T_ALIAS is the unique index.

The EXPLAIN ANALYZE output is as follows:


MYUSER=# explain analyze select t_info, dstats, id from trades where
t_alias = '17huv' and status = 'Y';

                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Index Scan using trades_unique_t_alias on trades  (cost=0.00..3.41
rows=1 width=110) (actual time=0.100..0.104 rows=1 loops=1)
   Index Cond: ((t_alias)::text = '17huv'::text)
   Filter: (status = 'Y'::bpchar)
 Total runtime: 0.166 ms
(4 rows)

Time: 2.990 ms




And my postgresql.conf is looking like this:



max_connections              = 350
shared_buffers               = 21000    # Not much more than
20k...http://www.revsys.com/writings/postgresql-performance.html
effective_cache_size         = 128000
max_fsm_relations            = 100
max_fsm_pages                = 150000
work_mem                     = 16000    #
http://www.revsys.com/writings/postgresql-performance.html
temp_buffers                 = 4096
authentication_timeout       = 10s
ssl                          = off
autovacuum                   = on
vacuum_cost_delay            = 20
stats_start_collector        = on
stats_row_level              = on
autovacuum_vacuum_threshold  = 300
autovacuum_analyze_threshold = 100
wal_buffers                  = 64
checkpoint_segments          = 128
checkpoint_timeout           = 900
fsync                        = on
maintenance_work_mem         = 128MB
enable_indexscan             = on
enable_bitmapscan            = off
####random_page_cost             = 1.5


Any thoughts? I tried a "random_page_cost" of 1.5 and 2 -- I
understand that keeping it at 1.5 would enable most of the data to be
in memory and I have 4GB of RAM -- but this made some of the queries
abysmally slow.

pgsql-general by date:

Previous
From: Alban Hertroys
Date:
Subject: Re: Statistics collection question
Next
From: Henrik
Date:
Subject: Vacuum process idle but hogging memory 8.2.4