Re: R: Index on integer or on string field - Mailing list pgsql-general

From William Dunn
Subject Re: R: Index on integer or on string field
Date
Msg-id CAEva=Vm8WJzu1KDOrEA9PJZK2jtJM849fVRnBuqoRZbE-f-DNg@mail.gmail.com
Whole thread Raw
In response to Re: R: Index on integer or on string field  (Arthur Silva <arthurprs@gmail.com>)
List pgsql-general
Hello Francesco,

You should probably set timing on, run an explain analyze, and use pgbadger to diagnose your performance issue.

While it may be the case that comparison in the index might be slightly faster because of the modulo arithmetic, those in-memory operations are extremely fast and it is likely that the seek in that index is the fastest part of your query. And since you only have 50 distinct values the btree is probably extremely shallow and there will be very few comparisons anyway.

I don't know much about your query but I suspect that the issue is that your index scan is not selective enough so Postgres needs to scan a lot from disk (which is extremely slow). If you want to improve the performance you should first try to make the query as selective as possible, and try to put an index on a more selective column of the WHERE clause.

If you really must rely primarily on that column which has only 50 distinct values you can try periodically running a CLUSTER command on the table for that column index (doc: http://www.postgresql.org/docs/devel/static/sql-cluster.html), or partition the table (doc: http://www.postgresql.org/docs/devel/static/ddl-partitioning.html) so that the data you are scanning is close together on disk and you can get as much of it per IO operation as possible.

Will J. Dunn

On Fri, May 15, 2015 at 1:32 PM, Arthur Silva <arthurprs@gmail.com> wrote:

Yes that's my suggestion. Btree-Gin deals with lots of repeated values much better than the Btree index as repeated keys are only stored once.

Em 15/05/2015 12:38, "Job" <Job@colliniconsulting.it> escreveu:
Hello Arthur!

So, i read that btree-gin have got "the ability to enforce uniqueness".
 
If in this 10.millions long table i have, in index, 50 recurring values, i can leave the alphabetical field and change to btree-gin the index on it?!

Thank you!
Francesco
 

Da: Arthur Silva [arthurprs@gmail.com]
Inviato: venerdì 15 maggio 2015 17.26
A: Job
Cc: pgsql-general@postgresql.org
Oggetto: Re: [GENERAL] Index on integer or on string field

You should probably experiment with a btree-gin index on those.

Em 15/05/2015 12:22, "Job" <Job@colliniconsulting.it> escreveu:
Hello,

i have a table of about 10 millions of records, with the index on a string field.
Actually is alphabetical; since queries are about 100/200 per seconds, i was looking for a better way to improve performance and reduce workload.

The unique values, of that fields, are about the 50 (category name), and we could create a second table to codify, with numerical integer values, the 50 recurring names.

Is index are integer and not characteral, performance are better and workload reduces?

Is there any comparisons?

Thank you!
Francesco

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Daniel Begin
Date:
Subject: Re: Restarting DB after moving to another drive
Next
From: Scott Marlowe
Date:
Subject: Re: Index on integer or on string field