Thread: Table with million rows - and PostgreSQL 9.1 is not using the index

Table with million rows - and PostgreSQL 9.1 is not using the index

From
Edson Richter
Date:
I have a table with more than 1,100,000 rows of streets.
I'm doing a partial search by zip code, and PostgreSQL is ignoring
my ZIP index.
I'm sure I'm making some mistake, but I can't see where.
The query is:

SELECT t2.ID, t2.CEP, t2.COMPLEMENTO, t2.NOME, t2.NOMESEMACENTOS,
t2.TIPO, t2.BAIRRO_ID
   FROM LOCALIDADE t0, LOGRADOURO t2, BAIRRO t1
  WHERE t2.CEP LIKE '81630160%' AND ((t1.ID = t2.BAIRRO_ID) AND (t0.ID =
t1.LOCALIDADE_ID)) ORDER BY t0.NOME;

(for reference, BAIRRO = town, LOCALIDADE = city, LOGRADOURO = street)

Here is the result of explain analyze:

Sort  (cost=11938.72..11938.74 rows=91 width=93)
   Sort Key: t0.nome
   ->  Nested Loop  (cost=0.00..11938.42 rows=91 width=93)
         ->  Nested Loop  (cost=0.00..11935.19 rows=91 width=85)
               ->  Seq Scan on logradouro t2  (cost=0.00..11634.42 rows=91 width=81)
                     Filter: ((cep)::text ~~ '81630160%'::text)
               ->  Index Scan using pkbairro on bairro t1  (cost=0.00..3.30 rows=1 width=8)
                     Index Cond: (id = t2.bairro_id)
         ->  Index Scan using pklocalidade on localidade t0  (cost=0.00..0.03 rows=1 width=16)
               Index Cond: ((id)::text = (t1.localidade_id)::text)

I have a few tweaks in postgresql.conf:
shared_buffers = 2GB
temp_buffers = 32MB
max_prepared_transactions = 50
work_mem = 32MB
maintenance_work_mem = 16MB
max_stack_depth = 4MB
max_files_per_process = 15000
random_page_cost = 2.0
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.00025
effective_cache_size = 512MB

Everything else is default configuration.

This machine is an Intel quad-core at 3.1 GHz with 8 threads, 8 GB of RAM,
and 8 GB of swap, running CentOS 6.3 64-bit.
The machine is idle almost all the time.

Thanks for your advice,

Edson


Re: Table with million rows - and PostgreSQL 9.1 is not using the index

From
Lonni J Friedman
Date:
I'm no expert on this, but it will likely be more helpful to others if
you include the table definitions with all their indexes.

On Tue, Dec 4, 2012 at 8:44 PM, Edson Richter <edsonrichter@hotmail.com> wrote:
> I have a table with more than 1,100,000 rows of streets.
> I'm doing a partial search by zip code, and PostgreSQL is ignoring my
> ZIP index.
> I'm sure I'm making some mistake, but I can't see where.
> The query is:
>
> SELECT t2.ID, t2.CEP, t2.COMPLEMENTO, t2.NOME, t2.NOMESEMACENTOS, t2.TIPO,
> t2.BAIRRO_ID
>   FROM LOCALIDADE t0, LOGRADOURO t2, BAIRRO t1
>  WHERE t2.CEP LIKE '81630160%' AND ((t1.ID = t2.BAIRRO_ID) AND (t0.ID =
> t1.LOCALIDADE_ID)) ORDER BY t0.NOME;
>
> (for reference, BAIRRO = town, LOCALIDADE = city, LOGRADOURO = street)


Re: Table with million rows - and PostgreSQL 9.1 is not using the index

From
Alan Hodgson
Date:
On Wednesday, December 05, 2012 02:44:39 AM Edson Richter wrote:
> Sort  (cost=11938.72..11938.74 rows=91 width=93)
>    Sort Key: t0.nome
>    ->  Nested Loop  (cost=0.00..11938.42 rows=91 width=93)
>          ->  Nested Loop  (cost=0.00..11935.19 rows=91 width=85)
>                ->  Seq Scan on logradouro t2  (cost=0.00..11634.42 rows=91 width=81)
>                      Filter: ((cep)::text ~~ '81630160%'::text)

According to that, the planner estimates only 91 matching rows in
logradouro, yet it still seq-scanned the table. Has it been analyzed?

Also, partial text matches require a special index declaration, as I recall.
Maybe post a \d of each table to help troubleshoot this.
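For reference, the statistics refresh and schema dump suggested above can be issued from psql like this (table names taken from the query in the original message; `\d` is a psql meta-command, not SQL):

```sql
-- Refresh the planner's statistics for the tables in the query
ANALYZE logradouro;
ANALYZE bairro;
ANALYZE localidade;
```

Then, in psql, `\d logradouro` (and likewise for bairro and localidade) prints each table's columns, indexes, and constraints.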


Re: Table with million rows - and PostgreSQL 9.1 is not using the index

From
ERR ORR
Date:
Hi Edson,

Since you are using 'like' in your select, you may want to try the following (example):
CREATE INDEX "MY_LONG_INDEX_NAME_IDX"
  ON "MY_TABLE_NAME"
  USING btree
  ("MY_VARCHAR_FIELD_NAME" COLLATE pg_catalog."default" varchar_pattern_ops);

(for TEXT fields, use text_pattern_ops in the index declaration).
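Applied to the table in this thread, the fix might look as follows (the index name is made up, and this assumes cep is a varchar column). The underlying issue is that under a non-C locale a default btree index cannot be used for LIKE prefix matches; the *_pattern_ops operator classes compare strings byte-wise, which can satisfy them:

```sql
-- Hypothetical pattern-ops index for prefix LIKE searches on logradouro.cep
CREATE INDEX logradouro_cep_like_idx
  ON logradouro
  USING btree (cep varchar_pattern_ops);

-- Re-check the plan; the seq scan should become an index scan:
EXPLAIN ANALYZE
SELECT id, cep
  FROM logradouro
 WHERE cep LIKE '81630160%';
```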

I declare all my indexes on string fields that way because most of my queries use like/ilike anyway, and I haven't noticed the indexes being any bigger than they would be without those operator classes. I have tables with up to 3M rows.

Next thing, perhaps your index is declared only for a part of the values in the column (partial index)?

Next, as Alan said, check that the index is up to date (reindex); if in doubt, drop and recreate it.

I hope that helps.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Table with million rows - and PostgreSQL 9.1 is not using the index

From
Edson Richter
Date:
On 05/12/2012 09:59, ERR ORR wrote:
Hi Edson,

since you are using 'like' in your select, you may want to try the following (example):
CREATE INDEX "MY_LONG_INDEX_NAME_IDX"
  ON "MY_TABLE_NAME"
  USING btree
  ("MY_VARCHAR_FIELD_NAME" COLLATE pg_catalog."default" varchar_pattern_ops);

(for TEXT fields, use text_pattern_ops in the index declaration).

That's it: you did the trick!

Query time dropped from 250 ms to 15 ms with the correct index! It's a huge improvement; I'll start changing all indexes that I expect to be used with the "like" operator to varchar_pattern_ops.
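To find the indexes worth converting, a catalog query along these lines can help (a sketch; it lists btree indexes in the public schema that don't already use a pattern operator class):

```sql
-- Candidate indexes for recreation with varchar_pattern_ops/text_pattern_ops
SELECT indexname, indexdef
  FROM pg_indexes
 WHERE schemaname = 'public'
   AND indexdef LIKE '%USING btree%'
   AND indexdef NOT LIKE '%pattern_ops%';
```

Note that this also lists indexes on non-text columns, so the output still needs a manual pass.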

Thank you very much!


Edson

