Re: Query seem to slow if table have more than 200 million rows - Mailing list pgsql-performance
From: Ahmad Fajar
Subject: Re: Query seem to slow if table have more than 200 million rows
Date:
Msg-id: SVONERLVbNDN4CxpAOA00000061@ki-communication.com
In response to: Re: Query seem to slow if table have more than 200 million rows ("Qingqing Zhou" <zhouqq@cs.toronto.edu>)
List: pgsql-performance
Hi Qingqing,

I don't know whether the statistics I got are bad or good; here they are:

scooby=# select a.relid, a.relname, b.indexrelid, b.indexrelname, c.idx_scan, c.idx_tup_read, c.idx_tup_fetch,
scooby-#     a.heap_blks_read, a.heap_blks_hit, a.idx_blks_read, a.idx_blks_hit,
scooby-#     a.toast_blks_read, a.toast_blks_hit, a.tidx_blks_read, a.tidx_blks_hit, b.idx_blks_read, b.idx_blks_hit
scooby-# from pg_statio_user_tables a, pg_statio_user_indexes b, pg_stat_all_indexes c
scooby-# where a.relid=b.relid and a.relid=c.relid and b.indexrelid=c.indexrelid and a.relname=b.relname and
scooby-#     a.relname=c.relname and a.relname='fti_dict1';

   relid   | relname   | indexrelid | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
 ----------+-----------+------------+--------------+----------+--------------+---------------
  22880226 | fti_dict1 |   22880231 | idx_dict3    |        0 |            0 |             0
  22880226 | fti_dict1 |   22880230 | idx_dict2    |        7 |       592799 |        592799
  22880226 | fti_dict1 |   22880229 | idx_dict1    |        0 |            0 |             0
 (3 rows)

(The remaining columns -- heap_blks_read/hit, idx_blks_read/hit, toast_blks_read/hit, and tidx_blks_read/hit -- were all 0 or NULL for every row.)

I have tried the query below several times with different keywords, but only idx_tup_read and idx_tup_fetch change; the others stay zero. The indexes are: ids (idx_dict1), keywords (idx_dict2, varchar_ops), keywords (idx_dict3, varchar_pattern_ops) ==> I use this index for queries like ... keywords like 'blabla%', just for testing purposes.

Regards,
ahmad fajar
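[As an aside, the two keyword indexes described above can be sketched as follows. This is a minimal reconstruction from the column list and index names given in the thread, not the poster's actual DDL; the point is that a plain btree index serves equality lookups, while a varchar_pattern_ops index is what lets LIKE 'prefix%' use an index in a non-C locale.]

```sql
-- Sketch, assuming the table layout described in the thread:
CREATE TABLE dict (
    ids      bigint,
    keywords varchar(150),
    weight   varchar(1),
    dpos     int
);

CREATE INDEX idx_dict2 ON dict (keywords);                      -- equality: keywords = 'blabla'
CREATE INDEX idx_dict3 ON dict (keywords varchar_pattern_ops);  -- prefix:   keywords LIKE 'blabla%'

-- The plan for a prefix search should show an index scan on idx_dict3:
EXPLAIN SELECT ids, keywords FROM dict WHERE keywords LIKE 'blabla%';
```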
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Qingqing Zhou
Sent: Tuesday, 27 September 2005 8:43
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Query seem to slow if table have more than 200 million rows

"Ahmad Fajar" <gendowo@konphalindo.or.id> wrote
>
> Select ids, keywords from dict where keywords='blabla' ('blabla' is a
> single word);
>
> The table have 200 million rows, I have index the keywords field. On the
> first time my query seem to slow to get the result, about 15-60 sec to get
> the result. But if I repeat the query I will get fast result. My question
> is why on the first time the query seem very slow.
>
> Table structure is quite simple:
>
> Ids bigint, keywords varchar(150), weight varchar(1), dpos int.
>

The first slowness is obviously caused by disk IOs. The second time is
faster because all the data pages it requires are already in the buffer
pool. 200 million rows is not a problem for a btree index; even if your
client tool appends some spaces to your keywords at insertion time, the
ideal btree is 5 to 6 layers high at most. Can you show the iostats of the
index from your statistics view?

http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-VIEWS

Regards,
Qingqing

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
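[The "5 to 6 layers" figure can be sanity-checked with rough arithmetic: btree height is about log_fanout(rows). The fanout values below (100-200 index entries per 8 KB page) are illustrative assumptions, not measurements of this table.]

```sql
-- Rough sketch: height ~= ceil(ln(rows) / ln(fanout)), fanout assumed 100-200
SELECT ceil(ln(200000000.0) / ln(100.0)) AS levels_low_fanout,   -- 5
       ceil(ln(200000000.0) / ln(200.0)) AS levels_high_fanout;  -- 4
```

Either way the lookup touches only a handful of pages once they are cached, which is consistent with the fast repeated queries reported above.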