Re: Indexes? - Mailing list pgsql-general
From | Mike Leahy |
---|---|
Subject | Re: Indexes? |
Date | |
Msg-id | 1066213284.3f8d1fa4d6e73@www.nexusmail.uwaterloo.ca Whole thread Raw |
In response to | Re: Indexes? (Stephan Szabo <sszabo@megazone.bigpanda.com>) |
Responses |
Re: Indexes?
|
List | pgsql-general |
Stephan, thanks for the response. I increased those variables you suggested, and that seems to have increased the memory allocated to the process in windows. Also, I tried the same query I was using, but with some actual values specified in the where statement - that got it to use the indexes. The only thing is, I would normally be joining such a statement to another table, in which case there wouldn't be a where statement. I don't think that it uses indexes in that case, even if the number of rows being used are a fraction of what's in the table. Regarding the vacuum results, here they are: INFO: --Relation public.tbl_censo_poblacion_1993-- INFO: Pages 283669: Changed 0, Empty 0; Tup 2553015: Vac 0, Keep 0, UnUsed 0. 150106 Total CPU 5.89s/2.90u sec elapsed 56.52 sec. VACUUM6 What do you make of these results? Thanks for your help. Mike -----Original Message----- From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com] Sent: October 15, 2003 12:11 AM To: Mike Leahy Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Indexes? On Tue, 14 Oct 2003, Mike Leahy wrote: > CREATE INDEX tbl_censo_poblacion_1993_manzana_idx ON tbl_censo_poblacion_1993 > ( dubicacion, zona, manzana ); > > Then I try a query such as: > > select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA > from tbl_censo_poblacion_1993 group by dubicacion, zona, manzana; > > The results from explain indicate that a sequential scan is used (as far as I > can tell). I tried adding where statement: > > select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA > from tbl_censo_poblacion_1993 where dubicacion <> '' and zona <> '' and > manzana <> '' group by dubicacion, zona, manzana; > > The EXPLAIN analysis results appear as follows: > > Aggregate (cost=847587.90..879024.28 rows=251491 width=27) (actual > time=272782.00..279458.00 rows=21459 loops=1) > -> Group (cost=847587.90..872737.01 rows=2514911 width=27) (actual > time=272782.00..278546.00 rows=2553015 loops=1) > -> Sort (cost=847587.90..853875.18 rows=2514911 width=27) (actual > time=272782.00..274533.00 rows=2553015 loop > s=1) > Sort Key: dubicacion, zona, manzana > -> Seq Scan on tbl_censo_poblacion_1993 (cost=0.00..328346.76 > rows=2514911 width=27) (actual time=0.00. .189570.00 rows=2553015 loops=1) Note how many rows it's estimating (and actually) getting matches for. It appears to be matching nearly every row so indexes aren't going to really help much here for PostgreSQL since it still has to go back to the actual table to find out whether or not the row is visible to your transaction. What does vacuum verbose tbl_censo_poblacion_1993; say? Specifically, how many pages does the table have? > As an aside, is there any way to increase the amount of memory allocated to > the postmaster.exe process? It seems to me that if I could add more than 4MB > that it has by default, then maybe that could increase the performance. You might want to raise both shared_buffers and sort_mem from their default values in postgresql.conf. For more information you should probably read: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html ---------------------------------------- This mail sent through www.mywaterloo.ca
pgsql-general by date: