Home > mailing lists

Re: Indexes? - Mailing list pgsql-general

From	Martijn van Oosterhout
Subject	Re: Indexes?
Date	October 15, 2003 02:29:45
Msg-id	20031015022813.GA22182@svana.org Whole thread Raw
In response to	Indexes? (Mike Leahy <mgleahy@fes.uwaterloo.ca>)
List	pgsql-general

Tree view

In general, the <> operator can't be indexed since it will match most rows,
in which case you're better off scanning the table anyway.

If you really run that query often, perhaps the answer is a partial index.

On Tue, Oct 14, 2003 at 09:57:05PM -0400, Mike Leahy wrote:
> Hello all,
>
> I have a question related to the use of indexes.  One of my tables is part of
> a census, with over 2.5 million records in it.  Clearly, just about any query
> takes a rather long time to complete.  I was hoping to reduce the amount of
> time by using indexes.  However, no matter what I do, PostgreSQL never seems
> to use them.  It would seem to make sense that if I calculate something
> grouped by a set of variables that have an index created for them, that the
> index should be used rather than a sequential search, but it doesn't.  I have
> a table with an index such as the one created below:
>
> CREATE INDEX tbl_censo_poblacion_1993_manzana_idx ON tbl_censo_poblacion_1993
> ( dubicacion, zona, manzana );
>
> Then I try a query such as:
>
> select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA
> from tbl_censo_poblacion_1993 group by dubicacion, zona, manzana;
>
> The results from explain indicate that a sequential scan is used (as far as I
> can tell).  I tried adding where statement:
>
> select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA
> from tbl_censo_poblacion_1993 where dubicacion <> '' and zona <> '' and
> manzana <> '' group by dubicacion, zona, manzana;
>
> The EXPLAIN analysis results appear as follows:
>
> Aggregate  (cost=847587.90..879024.28 rows=251491 width=27) (actual
> time=272782.00..279458.00 rows=21459 loops=1)
>    ->  Group  (cost=847587.90..872737.01 rows=2514911 width=27) (actual
> time=272782.00..278546.00 rows=2553015 loops=1)
>          ->  Sort  (cost=847587.90..853875.18 rows=2514911 width=27) (actual
> time=272782.00..274533.00 rows=2553015 loop
> s=1)
>                Sort Key: dubicacion, zona, manzana
>                ->  Seq Scan on tbl_censo_poblacion_1993  (cost=0.00..328346.76
> rows=2514911 width=27) (actual time=0.00. .189570.00 rows=2553015 loops=1)
>                      Filter: ((dubicacion <> ''::character varying) AND (zona
> <> ''::character varying) AND (manzana <> ''::character varying))  Total
> runtime: 279494.00 msec (7 rows)
>
>
> So...can anyone suggest to me what I could do?  I'm using PostgreSQL 7.3.4-2
> in Cygwin on a WinXP platform.
>
> As an aside, is there any way to increase the amount of memory allocated to
> the postmaster.exe process?  It seems to me that if I could add more than 4MB
> that it has by default, then maybe that could increase the performance.
>
> Any help is appreciated.
>
> Kind regards,
> Mike
>
>
>
>
> ----------------------------------------
> This mail sent through www.mywaterloo.ca
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

msg-32091-68852.dat

pgsql-general by date:

From: Mike Leahy
Date: 15 October 2003, 01:57:35
Subject: Indexes?

From: Greg Stark
Date: 15 October 2003, 03:09:19
Subject: Re: Question

Re: Indexes? - Mailing list pgsql-general

Attachment

Previous

Next