Home > mailing lists

Re: Query taking long time - Mailing list pgsql-performance

From	Venkata Balaji Nagothi
Subject	Re: Query taking long time
Date	March 4, 2014 02:28:27
Msg-id	CAHBAh5s8_Q9hnHbNgEhukS1qGuz-Gzc0NZgKfj3B0vZYx6cUWw@mail.gmail.com Whole thread Raw
In response to	Re: Query taking long time ("acanada" <acanada@cnio.es>)
Responses	Re: Query taking long time ("acanada" <acanada@cnio.es>) Re: Query taking long time ("acanada" <acanada@cnio.es>)
List	pgsql-performance

Tree view

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:

Hello,

Thankyou for your answer.
I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in order to avoid the join clause. Now the schema is very simple. The query only implies one table:

x=> \d+ entity_compounddict2document;
Table "public.entity_compounddict2document"
Column | Type | Modifiers | Storage | Description
------------------+--------------------------------+-----------+----------+-------------
id | integer | not null | plain |
document_id | integer | | plain |
name | character varying(255) | | extended |
qualifier | character varying(255) | | extended |
tagMethod | character varying(255) | | extended |
created | timestamp(0) without time zone | | plain |
updated | timestamp(0) without time zone | | plain |
curation | integer | | plain |
hepval | double precision | | plain |
cardval | double precision | | plain |
nephval | double precision | | plain |
phosval | double precision | | plain |
patternCount | double precision | | plain |
ruleScore | double precision | | plain |
hepTermNormScore | double precision | | plain |
hepTermVarScore | double precision | | plain |
Indexes:
"entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
"entity_compound2document_cardval" btree (cardval)
"entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
"entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
"entity_compound2document_hepval" btree (hepval)
"entity_compound2document_name" btree (name)
"entity_compound2document_nephval" btree (nephval)
"entity_compound2document_patterncount" btree ("patternCount")
"entity_compound2document_phosval" btree (phosval)
"entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

tablename | indexname | num_rows | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
entity_compounddict2document | entity_compound2document_cardval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_heptermnormscore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_heptermvarscore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_hepval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_name | 5.42452e+07 | 6763 MB | 1505 MB | Y | 24 | 178680 | 0
entity_compounddict2document | entity_compound2document_nephval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_patterncount | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_phosval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_rulescore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compounddict2document_pkey | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0

The table has aprox. 54,000,000 rows
There are no NULLs in hepval field and pg_settings haven't changed. I also have done "analyze" to this table.

I have simplified the query and added the last advise that you told me:

Query:

explain analyze select * from (select * from entity_compounddict2document where name='ranitidine') as a order by a.hepval;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
Sort Key: entity_compounddict2document.hepval
Sort Method: quicksort Memory: 2301kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
Recheck Cond: ((name)::text = 'ranitidine'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
Index Cond: ((name)::text = 'ranitidine'::text)
Total runtime: 32717.548 ms

Another query:

explain analyze select * from (select * from entity_compounddict2document where name='progesterone' ) as a order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
Sort Key: entity_compounddict2document.hepval
Sort Method: quicksort Memory: 25622kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
Recheck Cond: ((name)::text = 'progesterone'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
Index Cond: ((name)::text = 'progesterone'::text)
Total runtime: 9296.815 ms

It has improved (I supose because of the lack of the join table) but still taking a lot of time... Anything I can do??

Any help would be very appreciated. Thank you very much.

Good to know performance has increased.

"entity_compounddict2document" table goes through high INSERTS ?

Can you help us know if the "helpval" column and "name" column have high duplicate values ? "n_distinct" value from pg_stats table would have that info.

Below could be a possible workaround -

As mentioned earlier in this email, a composite Index on name and hepval column might help. If the table does not go through lot of INSERTS, then consider performing a CLUSTER on the table using the same INDEX.

Other recommendations -

Please drop off all the Non-primary key Indexes which have 0 scans / hits. This would harm the DB and the DB server whilst maintenance and DML operations.

Regards,

Venkata Balaji N

Fujitsu Australia

pgsql-performance by date:

From: Claudio Freire
Date: 04 March 2014, 00:52:13
Subject: Re: Help with optimizing a query over hierarchical data

From: Damon Snyder
Date: 04 March 2014, 04:12:28
Subject: Re: Help with optimizing a query over hierarchical data

Re: Query taking long time - Mailing list pgsql-performance

Previous

Next