Thread: Table with million rows - and PostgreSQL 9.1 is not using the index
I have a table with >1,100,000 rows of streets. I'm doing a partial search by zip code, and PostgreSQL is ignoring my ZIP index. I'm sure I'm making some mistake, but I can't see where.

The query is:

SELECT t2.ID, t2.CEP, t2.COMPLEMENTO, t2.NOME, t2.NOMESEMACENTOS, t2.TIPO, t2.BAIRRO_ID
FROM LOCALIDADE t0, LOGRADOURO t2, BAIRRO t1
WHERE t2.CEP LIKE '81630160%'
  AND ((t1.ID = t2.BAIRRO_ID) AND (t0.ID = t1.LOCALIDADE_ID))
ORDER BY t0.NOME;

(for reference, BAIRRO = town, LOCALIDADE = city, LOGRADOURO = street)

Here is the result of explain analyze:

Sort  (cost=11938.72..11938.74 rows=91 width=93)
  Sort Key: t0.nome
  ->  Nested Loop  (cost=0.00..11938.42 rows=91 width=93)
        ->  Nested Loop  (cost=0.00..11935.19 rows=91 width=85)
              ->  Seq Scan on logradouro t2  (cost=0.00..11634.42 rows=91 width=81)
                    Filter: ((cep)::text ~~ '81630160%'::text)
              ->  Index Scan using pkbairro on bairro t1  (cost=0.00..3.30 rows=1 width=8)
                    Index Cond: (id = t2.bairro_id)
        ->  Index Scan using pklocalidade on localidade t0  (cost=0.00..0.03 rows=1 width=16)
              Index Cond: ((id)::text = (t1.localidade_id)::text)

I have a few tweaks in postgresql.conf:

shared_buffers = 2GB
temp_buffers = 32MB
max_prepared_transactions = 50
work_mem = 32MB
maintenance_work_mem = 16MB
max_stack_depth = 4MB
max_files_per_process = 15000
random_page_cost = 2.0
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.00025
effective_cache_size = 512MB

Everything else is at the default configuration.

The machine is an Intel quad-core 3.1 GHz with 8 threads, 8 GB of RAM, and 8 GB of swap, running CentOS 6.3 64-bit. The machine is idle almost all the time.

Thanks for your advice,

Edson
I'm no expert on this, but it will likely be more helpful to others if you include the table description with all the indices.

On Tue, Dec 4, 2012 at 8:44 PM, Edson Richter <edsonrichter@hotmail.com> wrote:
> I've a table with >1100000 rows, with streets.
> I'm making a partial search using zip code, and PostgreSQL is ignoring my
> ZIP index.
> I'm sure I'm making some mistake, but I can't see where.
> [...]
On Wednesday, December 05, 2012 02:44:39 AM Edson Richter wrote:
> Sort  (cost=11938.72..11938.74 rows=91 width=93)
>   Sort Key: t0.nome
>   ->  Nested Loop  (cost=0.00..11938.42 rows=91 width=93)
>         ->  Nested Loop  (cost=0.00..11935.19 rows=91 width=85)
>               ->  Seq Scan on logradouro t2  (cost=0.00..11634.42 rows=91 width=81)
>                     Filter: ((cep)::text ~~ '81630160%'::text)

According to that, the logradouro table only has 91 rows, which is why it seq-scanned it. Has it been analyzed?

Also, partial text matches require a special index declaration, as I recall. Maybe post a \d of each table to help troubleshoot this.
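As a concrete sketch of that first check, refreshing the statistics and re-checking the plan might look like this (table and column names are taken from the plan above, not from a verified schema):

```sql
-- Refresh planner statistics for the table the Seq Scan touched
ANALYZE logradouro;

-- Re-run the filter from the problem query to see whether the plan changes
EXPLAIN ANALYZE
SELECT id, cep
FROM logradouro
WHERE cep LIKE '81630160%';
```

If the row estimate stays wildly off after ANALYZE, the problem is more likely the index declaration than stale statistics.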
Hi Edson,
since you are using 'like' in your select, you may want to try the following (example):
CREATE INDEX "MY_LONG_INDEX_NAME_IDX"
ON "MY_TABLE_NAME"
USING btree
("MY_VARCHAR_FIELD_NAME" COLLATE pg_catalog."default" varchar_pattern_ops);
(for TEXT fields, use text_pattern_ops in the index declaration).
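Applied to the table from this thread, that pattern would look roughly like the following (the index name here is made up for illustration):

```sql
-- B-tree index usable for left-anchored LIKE patterns such as '81630160%'
CREATE INDEX logradouro_cep_like_idx
    ON logradouro
    USING btree
    (cep varchar_pattern_ops);
```

Such an index supports cep LIKE 'prefix%' searches; it does not help with patterns that start with a wildcard, such as '%0160'.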
I declare all my indexes on string fields that way because most of my queries use like/ilike anyway, and I haven't noticed the indexes being bigger than they would be without those clauses - I have tables with up to 3M rows.
Next thing, perhaps your index is declared only for a part of the values in the column (partial index)?
Next, as Alan said, check that the index is up-to-date (reindex); if in doubt, drop and recreate it.
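For completeness, a rebuild along those lines could look like this (reusing the example names from above; adjust to your real index and table):

```sql
REINDEX INDEX "MY_LONG_INDEX_NAME_IDX";

-- or, if in doubt, drop and recreate it:
DROP INDEX "MY_LONG_INDEX_NAME_IDX";
CREATE INDEX "MY_LONG_INDEX_NAME_IDX"
    ON "MY_TABLE_NAME"
    USING btree
    ("MY_VARCHAR_FIELD_NAME" COLLATE pg_catalog."default" varchar_pattern_ops);
```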
I hope that helps.
On 5 December 2012 06:02, Alan Hodgson <ahodgson@simkin.ca> wrote:
> According to that the logradouro table only has 91 rows, which is why it
> seq-scanned it. Has it been analyzed?
> Also, partial text matches require a special index declaration, as I recall.
> Maybe post a \d of each table to help troubleshoot this.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 05/12/2012 09:59, ERR ORR wrote:
> Hi Edson,
> since you are using 'like' in your select, you may want to try the following (example):
> CREATE INDEX "MY_LONG_INDEX_NAME_IDX"
>     ON "MY_TABLE_NAME"
>     USING btree
>     ("MY_VARCHAR_FIELD_NAME" COLLATE pg_catalog."default" varchar_pattern_ops);
> (for TEXT fields, use text_pattern_ops in the index declaration).
That's it: you did the trick!
Query time dropped from 250 ms to 15 ms with the correct index! It is a huge improvement; I'll start changing all the indexes that I expect to be used with the "like" operator to use varchar_pattern_ops.
Thank you very much!
Edson