Re: Statistics collection question - Mailing list pgsql-general
From | Phoenix Kiula |
---|---|
Subject | Re: Statistics collection question |
Date | |
Msg-id | e373d31e0709030905h59efcebbv7a1d3c0cdab12640@mail.gmail.com Whole thread Raw |
In response to | Re: Statistics collection question (Alban Hertroys <alban@magproductions.nl>) |
Responses |
Re: Statistics collection question
|
List | pgsql-general |
On 03/09/07, Alban Hertroys <alban@magproductions.nl> wrote: > Phoenix Kiula wrote: > As I understand it it's a sample of how the data is distributed. > Probably it's based on statistical mathematics that specifies a minimum > size for a representive sample of a given data set. It boils down to: > "If you want to know how many people like vanilla ice cream, how many > people do you need to ask their preference?". Thanks for this explanation. So I should set the statistics on my indexed column to 10 or so? I made it 1000 this morning, trying to see how it would affect performance. > That's definitely not normal. I have a smallish table here containing > 2.5 million records, and querying for one with a specific index takes > 141 micro(!) seconds. The hardware involved is a dual opteron with 4G, > in a xen domain; I don't know what disks are used, but I doubt they're > raptors. > > So something is wrong with your setup, that much is obvious. I sincerely > doubt that postgres is to blame here. > > You did check that you're not connecting through the internet and > getting a DNS timeout? I am getting these times from the postgres log (pglog). I have setup the minimum query time as 5000 (ms). Here is an except from my log...which is constantly updated with more and more of these! Here's an excerpt from the log. It looks abysmal!! ------------------- LOG: duration: 85865.904 ms statement: select t_info, dstats, id from trades where t_alias = '1q8bf' and status = 'Y' LOG: duration: 83859.505 ms statement: select t_info, dstats, id from trades where t_alias = '1a7iv' and status = 'Y' LOG: duration: 71922.423 ms statement: select t_info, dstats, id from trades where t_alias = 'bvu' and status = 'Y' LOG: duration: 74924.741 ms statement: select t_info, dstats, id from trades where t_alias = 'nt3g' and status = 'Y' LOG: duration: 82471.036 ms statement: select t_info, dstats, id from trades where t_alias = '15p8m' and status = 'Y' LOG: duration: 90015.410 ms statement: select t_info, dstats, id from trades where t_alias = 'pkfi' and status = 'Y' LOG: duration: 72713.815 ms statement: select t_info, dstats, id from trades where t_alias = 'evdi' and status = 'Y' LOG: duration: 88054.444 ms statement: select t_info, dstats, id from trades where t_alias = '1a8zj' and status = 'Y' LOG: duration: 94502.678 ms statement: select t_info, dstats, id from trades where t_alias = '1d188' and status = 'Y' LOG: duration: 82178.724 ms statement: select t_info, dstats, id from trades where t_alias = 'q8zu' and status = 'Y' LOG: duration: 107030.741 ms statement: select t_info, dstats, id from trades where t_alias = 'jnzu' and status = 'Y' LOG: duration: 87634.723 ms statement: select t_info, dstats, id from trades where t_alias = 'tav9' and status = 'Y' LOG: duration: 104271.695 ms statement: select t_info, dstats, id from trades where t_alias = '37tk7' and status = 'Y' LOG: duration: 88726.671 ms statement: select t_info, dstats, id from trades where t_alias = 'tavc' and status = 'Y' LOG: duration: 74710.120 ms statement: select t_info, dstats, id from trades where t_alias = '1q8zu' and status = 'Y' LOG: duration: 93100.863 ms statement: select t_info, dstats, id from trades where t_alias = '1ovmc' and status = 'Y' LOG: duration: 83659.489 ms statement: select t_info, dstats, id from trades where t_alias = '1p9ub' and status = 'Y' LOG: duration: 71963.413 ms statement: select t_info, dstats, id from trades where t_alias = '9awlia' and status = 'Y' LOG: duration: 83569.602 ms statement: select t_info, dstats, id from trades where t_alias = '2yeza' and status = 'Y' LOG: duration: 93473.282 ms statement: select t_info, dstats, id from trades where t_alias = '17huv' and status = 'Y' ----------------------- By way of an explanation, the T_INFO is a text column, DSTATS is char(1), and ID is the bigint primary key. Status can be 'Y' or 'N', so I have not included it in the index (not selective enough) but T_ALIAS is the unique index. The EXPLAIN ANALYZE output is as follows: MYUSER=# explain analyze select t_info, dstats, id from trades where t_alias = '17huv' and status = 'Y'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Index Scan using trades_unique_t_alias on trades (cost=0.00..3.41 rows=1 width=110) (actual time=0.100..0.104 rows=1 loops=1) Index Cond: ((t_alias)::text = '17huv'::text) Filter: (status = 'Y'::bpchar) Total runtime: 0.166 ms (4 rows) Time: 2.990 ms And my postgresql.conf is looking like this: max_connections = 350 shared_buffers = 21000 # Not much more than 20k...http://www.revsys.com/writings/postgresql-performance.html effective_cache_size = 128000 max_fsm_relations = 100 max_fsm_pages = 150000 work_mem = 16000 # http://www.revsys.com/writings/postgresql-performance.html temp_buffers = 4096 authentication_timeout = 10s ssl = off autovacuum = on vacuum_cost_delay = 20 stats_start_collector = on stats_row_level = on autovacuum_vacuum_threshold = 300 autovacuum_analyze_threshold = 100 wal_buffers = 64 checkpoint_segments = 128 checkpoint_timeout = 900 fsync = on maintenance_work_mem = 128MB enable_indexscan = on enable_bitmapscan = off ####random_page_cost = 1.5 Any thoughts? I tried a "random_page_cost" of 1.5 and 2 -- I understand that keeping it at 1.5 would enable most of the data to be in memory and I have 4GB of RAM -- but this made some of the queries abysmally slow.
pgsql-general by date: