Thread: Query taking long time

From: "acanada"
Hello to everybody, and thanks in advance for taking a look at this message.
I'm new to this list and to PostgreSQL.
My queries are taking too much time to complete and I don't know what to do right now. I think I'm providing all the info required for you to help me. If you need extra info, please tell me.

I am using the DQL included in the latest version of Symfony2 (2.4.2). This is the query, generated by DQL but copied and pasted into the psql client (client 9.1.11, server 8.3.8):

explain analyze SELECT e0_.id AS id0, e0_.name AS name1, e0_.qualifier AS qualifier2,
e0_."tagMethod" AS tagmethod3, e0_.curation AS curation4, e0_.created AS created5,
e0_.updated AS updated6, d1_.id AS id7, d1_.kind AS kind8, d1_.uid AS uid9,
d1_."sentenceId" AS sentenceid10, d1_.text AS text11, d1_.hepval AS hepval12,
d1_.cardval AS cardval13, d1_.nephval AS nephval14, d1_.phosval AS phosval15,
d1_."patternCount" AS patterncount16, d1_."ruleScore" AS rulescore17,
d1_."hepTermNormScore" AS heptermnormscore18, d1_."hepTermVarScore" AS heptermvarscore19,
d1_.created AS created20, d1_.updated AS updated21, e0_.document_id AS document_id22
FROM Entity2Document e0_
INNER JOIN documentold d1_ ON e0_.document_id = d1_.id
WHERE e0_.name = 'ranitidine' AND e0_.qualifier = 'CompoundDict' AND d1_.hepval IS NOT NULL
ORDER BY d1_.hepval DESC LIMIT 10;


limtox=> \d+ documentold;
                               Table "public.documentold"
      Column      |              Type              | Modifiers | Storage  | Description
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    |
 kind             | character varying(255)         | not null  | extended |
 uid              | character varying(255)         | not null  | extended |
 sentenceId       | character varying(255)         | not null  | extended |
 text             | text                           | not null  | extended |
 hepval           | double precision               |           | plain    |
 created          | timestamp(0) without time zone | not null  | plain    |
 updated          | timestamp(0) without time zone |           | plain    |
 cardval          | double precision               |           | plain    |
 nephval          | double precision               |           | plain    |
 phosval          | double precision               |           | plain    |
 patternCount     | double precision               |           | plain    |
 ruleScore        | double precision               |           | plain    |
 hepTermNormScore | double precision               |           | plain    |
 hepTermVarScore  | double precision               |           | plain    |
Indexes:
    "DocumentOLD_pkey" PRIMARY KEY, btree (id)
    "document_cardval_index" btree (cardval)
    "document_heptermnorm_index" btree ("hepTermNormScore" DESC NULLS LAST)
    "document_heptermvar_index" btree ("hepTermVarScore" DESC NULLS LAST)
    "document_hepval_index" btree (hepval DESC NULLS LAST)
    "document_kind_index" btree (kind)
    "document_nephval_index" btree (nephval DESC NULLS LAST)
    "document_patterncount_index" btree ("patternCount" DESC NULLS LAST)
    "document_phosval_index" btree (phosval DESC NULLS LAST)
    "document_rulescore_index" btree ("ruleScore" DESC NULLS LAST)
    "document_sentenceid_index" btree ("sentenceId")
    "document_uid_index" btree (uid)
Referenced by:
    TABLE "hepkeywordtermnorm2document" CONSTRAINT "fk_1c19bcd0c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
    TABLE "cytochrome2document" CONSTRAINT "fk_21f7636fc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
    TABLE "hepkeywordtermvariant2document" CONSTRAINT "fk_a316e36bc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
    TABLE "entity2document" CONSTRAINT "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
    TABLE "specie2document" CONSTRAINT "fk_b6e551c8c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
Has OIDs: no




limtox=> \d+ entity2document;
                                Table "public.entity2document"
   Column    |              Type              |            Modifiers            | Storage  | Description
-------------+--------------------------------+---------------------------------+----------+-------------
 id          | integer                        | not null                        | plain    |
 document_id | integer                        |                                 | plain    |
 name        | character varying(255)         | not null                        | extended |
 qualifier   | character varying(255)         | not null                        | extended |
 tagMethod   | character varying(255)         | default NULL::character varying | extended |
 created     | timestamp(0) without time zone | not null                        | plain    |
 updated     | timestamp(0) without time zone |                                 | plain    |
 curation    | integer                        |                                 | plain    |
Indexes:
    "Entity2Document_pkey" PRIMARY KEY, btree (id)
    "entity2Document_name_index" btree (name)
    "entity2document_name_qualifier_index" btree (name, qualifier)
    "idx_a6020c0dc33f7837" btree (document_id)
    "qualifier_index" btree (qualifier)
Foreign-key constraints:
    "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
Has OIDs: no






Table metadata:
    documentold: 124,515,592 rows. It has several columns with a large proportion of NULLs (updated, patternCount, ruleScore, hepTermNormScore, hepTermVarScore)
    entity2document: 93,785,968 rows. It has two columns with a large proportion of NULLs (updated, curation)

None of the tables receive updates or deletes regularly


                                                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=387929.02..387929.05 rows=10 width=313) (actual time=55980.472..55980.476 rows=10 loops=1)
   ->  Sort  (cost=387929.02..387966.75 rows=15090 width=313) (actual time=55980.471..55980.473 rows=10 loops=1)
         Sort Key: d1_.hepval
         Sort Method:  top-N heapsort  Memory: 28kB
         ->  Nested Loop  (cost=469.14..387602.93 rows=15090 width=313) (actual time=96.716..55974.004 rows=2774 loops=1)
               ->  Bitmap Heap Scan on entity2document e0_  (cost=469.14..54851.25 rows=15090 width=59) (actual time=51.299..8452.592 rows=2774 loops=1)
                     Recheck Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
                     ->  Bitmap Index Scan on entity2document_name_qualifier_index  (cost=0.00..465.36 rows=15090 width=0) (actual time=36.467..36.467 rows=2774 loops=1)
                           Index Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=17.113..17.129 rows=1 loops=2774)
                     Index Cond: (d1_.id = e0_.document_id)
                     Filter: (d1_.hepval IS NOT NULL)
 Total runtime: 55980.554 ms
(13 rows)

 version
-----------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 10.4.1-3ubuntu3) 10.4.1

 This query has always been slow; it's fast only when it's cached. VACUUM and ANALYZE have been run manually very recently.




 SELECT name, current_setting(name), source
  FROM pg_settings
  WHERE source NOT IN ('default', 'override');

             name            |  current_setting   |        source
----------------------------+--------------------+----------------------
 client_encoding            | UTF8               | client
 DateStyle                  | ISO, DMY           | configuration file
 default_text_search_config | pg_catalog.spanish | configuration file
 effective_cache_size       | 7500MB             | configuration file
 lc_messages                | es_ES.UTF-8        | configuration file
 lc_monetary                | es_ES.UTF-8        | configuration file
 lc_numeric                 | C                  | configuration file
 lc_time                    | es_ES.UTF-8        | configuration file
 listen_addresses           | *                  | configuration file
 log_line_prefix            | %t                 | configuration file
 log_timezone               | localtime          | command line
 maintenance_work_mem       | 2000MB             | configuration file
 max_connections            | 100                | configuration file
 max_fsm_pages              | 63217760           | configuration file
 max_stack_depth            | 2MB                | environment variable
 port                       | 5432               | configuration file
 shared_buffers             | 1500MB             | configuration file
 ssl                        | on                 | configuration file
 tcp_keepalives_count       | 9                  | configuration file
 tcp_keepalives_idle        | 7200               | configuration file
 tcp_keepalives_interval    | 75                 | configuration file
 TimeZone                   | localtime          | command line
 timezone_abbreviations     | Default            | command line
 work_mem                   | 50MB               | configuration file

 Setting the work_mem to 3000MB doesn't change anything...

 Everything seems good to me except the Recheck Cond: because of the large amount of rows, it is slowing the query down too much. I have read that trying to get rid of the recheck cond is not a good idea (maybe it's not even possible, I don't know, I'm new to PostgreSQL). I'd like to know what I am doing wrong and how I can solve it...
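
An aside not raised in the thread: the plan above estimates 15090 rows for the name/qualifier condition but only 2774 actually match, so the column statistics may be too coarse. One low-risk experiment (a sketch, assuming nothing beyond the schema shown above) is to raise the per-column statistics target, which is supported on 8.3:

```sql
-- Hypothetical experiment, not a suggestion from the thread: a larger
-- statistics sample can tighten the 15090-vs-2774 row misestimate.
ALTER TABLE entity2document ALTER COLUMN name SET STATISTICS 1000;
ALTER TABLE entity2document ALTER COLUMN qualifier SET STATISTICS 1000;
ANALYZE entity2document;
```

A better estimate will not by itself make the heap fetches faster, but it helps the planner choose between the bitmap scan and a plain index scan.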

 Any help please?

 Thank you very much,

 Andrés
**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.



Re: Query taking long time

From: "acanada"
Hello,

I have changed the multicolumn index from:
    "entity2document_name_qualifier_index" btree (name, qualifier)
 to:
     "document_qualifier_name_index" btree (qualifier, name)

And now the planner doesn't do the "Recheck Cond:" (there are only three distinct qualifiers vs. millions of names...)
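
For reference, the index swap described above would look roughly like this (index and table names taken from the thread; this is a sketch of what was done, not new advice):

```sql
-- Build the (qualifier, name) index first, then drop the old one,
-- so the query is never left without a usable index.
CREATE INDEX document_qualifier_name_index ON entity2document (qualifier, name);
DROP INDEX entity2document_name_qualifier_index;
```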

But it is still taking a long time:


 Limit  (cost=384043.64..384043.66 rows=10 width=313) (actual time=80555.930..80555.934 rows=10 loops=1)
   ->  Sort  (cost=384043.64..384081.19 rows=15020 width=313) (actual time=80555.928..80555.931 rows=10 loops=1)
         Sort Key: d1_.hepval
         Sort Method:  top-N heapsort  Memory: 29kB
         ->  Nested Loop  (cost=0.00..383719.06 rows=15020 width=313) (actual time=223.778..80547.196 rows=3170 loops=1)
               ->  Index Scan using document_qualifier_name_index on entity2document e0_  (cost=0.00..52505.40 rows=15020 width=59) (actual time=126.880..11549.392 rows=3170 loops=1)
                     Index Cond: (((qualifier)::text = 'CompoundDict'::text) AND ((name)::text = 'galactosamine'::text))
               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=21.747..21.764 rows=1 loops=3170)
                     Index Cond: (d1_.id = e0_.document_id)
                     Filter: (d1_.hepval IS NOT NULL)
 Total runtime: 80556.027 ms



Any help/point to any direction, would be very appreciated.
Thank you,
Andrés

On Feb 26, 2014, at 4:41 PM, acanada wrote:






Re: Query taking long time

From: "acanada"
Hello,

Thank you for your answer.
I have made more changes than just re-indexing recently. I have moved the sorting field into the table itself in order to avoid the join. Now the schema is very simple; the query involves only one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

           tablename            |                   indexname                    |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
--------------------------------+------------------------------------------------+-------------+------------+------------+--------+-----------------+-------------+----------------
 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has approx. 54,000,000 rows.
There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.

I have simplified the query and applied the last piece of advice you gave me:

Query: 

 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 32717.548 ms
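
One option not tried in the thread (an assumption on my part, using the table and column names shown above): now that the filter column and the sort column live in the same table, a two-column btree can return rows already ordered by hepval within each name, which lets the planner skip the Sort node for `WHERE name = ... ORDER BY hepval` queries:

```sql
-- Hypothetical composite index; btree (name, hepval) stores entries ordered
-- by hepval within each name, so the query can read the index in order
-- instead of fetching everything and sorting afterwards.
CREATE INDEX entity_compound2document_name_hepval
    ON entity_compounddict2document (name, hepval);
```

With a LIMIT added back, this also avoids visiting more heap rows than the limit requires.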


Another query:
explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 9296.815 ms


It has improved (I suppose because the join is gone) but it's still taking a lot of time... Is there anything I can do??
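
Looking at the plans above, nearly all of the 32 s for 'ranitidine' is spent in the Bitmap Heap Scan, i.e. random I/O fetching 13512 rows scattered across a 6.7 GB heap. One possible remedy, offered here as a sketch rather than advice given in the thread, is to physically reorder the table by name so that matching rows sit on adjacent pages:

```sql
-- CLUSTER rewrites the whole table in index order and holds an exclusive
-- lock for the duration, so it needs a maintenance window. The order is not
-- maintained afterwards, but this table receives no regular updates.
CLUSTER entity_compounddict2document USING entity_compound2document_name;
ANALYZE entity_compounddict2document;
```

The `CLUSTER ... USING` syntax is available from 8.3 onward; older releases use `CLUSTER indexname ON tablename`.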

Any help would be very appreciated. Thank you very much

Andrés.


On Mar 3, 2014, at 1:04 AM, Venkata Balaji Nagothi wrote:

Was any re-indexing done recently?

If the SELECT query without ORDER BY is showing a low cost, then the query can be re-written as below to see if the performance improves. If the resultant rows of the query are

select * from (select query without order by clause) a order by a.hepval and so on -- something like that.

This should lower the cost of the query because the planner chooses to sort on the resultant set of the rows rather than sorting the table and getting the results.
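
Applied to the original query, the suggested shape would be something like the following (a sketch only; the column list is abbreviated for readability, and the outer SELECT pulls hepval out explicitly to keep the ORDER BY unambiguous):

```sql
SELECT *
FROM (SELECT e0_.*, d1_.hepval
      FROM Entity2Document e0_
      JOIN documentold d1_ ON e0_.document_id = d1_.id
      WHERE e0_.name = 'ranitidine'
        AND e0_.qualifier = 'CompoundDict'
        AND d1_.hepval IS NOT NULL) AS a
ORDER BY a.hepval DESC
LIMIT 10;
```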

Please let us know if this helps !

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia


On Fri, Feb 28, 2014 at 8:55 PM, acanada <acanada@cnio.es> wrote:
Thank you for your answer!

Sizes of Tables and Indexes are:

            relname             | rows_in_bytes |  num_rows   | number_of_indexes | unique | single_column | multi_column 
--------------------------------+---------------+-------------+-------------------+--------+---------------+--------------
 documentold                    | 119 MB        | 1.24516e+08 |                12 | Y      |            12 |            0
 entity2document                | 89 MB         | 9.33666e+07 |                 5 | Y      |             4 |            1

           tablename            |                   indexname                    |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
--------------------------------+------------------------------------------------+-------------+------------+------------+--------+-----------------+-------------+----------------
 documentold                    | DocumentOLD_pkey                               | 1.24516e+08 | 33 GB      | 2708 MB    | Y      |        45812802 |   924462741 |      924084958
 documentold                    | document_cardval_index                         | 1.24516e+08 | 33 GB      | 2763 MB    | Y      |               0 |           0 |              0
 documentold                    | document_heptermnorm_index                     | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_heptermvar_index                      | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_hepval_index                          | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_kind_index                            | 1.24516e+08 | 33 GB      | 2859 MB    | Y      |               0 |           0 |              0
 documentold                    | document_nephval_index                         | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_patterncount_index                    | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_phosval_index                         | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_rulescore_index                       | 1.24516e+08 | 33 GB      | 2667 MB    | Y      |               0 |           0 |              0
 documentold                    | document_sentenceid_index                      | 1.24516e+08 | 33 GB      | 3867 MB    | Y      |         8089466 |    12669585 |        7597923
 documentold                    | document_uid_index                             | 1.24516e+08 | 33 GB      | 3889 MB    | Y      |               0 |           0 |              0
 entity2document                | Entity2Document_pkey                           | 9.33666e+07 | 7216 MB    | 2000 MB    | Y      |            2942 |        2942 |           2942
 entity2document                | document_qualifier_name_index                  | 9.33666e+07 | 7216 MB    | 3557 MB    | Y      |              93 |     1091680 |         124525
 entity2document                | entity2Document_name_index                     | 9.33666e+07 | 7216 MB    | 2550 MB    | Y      |            4330 |     3320634 |              2
 entity2document                | idx_a6020c0dc33f7837                           | 9.33666e+07 | 7216 MB    | 2000 MB    | Y      |         2465927 |     1661666 |        1661666
 entity2document                | qualifier_index                                | 9.33666e+07 | 7216 MB    | 2469 MB    | Y      |              51 |  2333120186 |              0

The explain plan shows a lower cost without ORDER BY!
There are no NULLs in the hepval field...


Thank you for your time!

Andrés



On Feb 28, 2014, at 2:28 AM, Venkata Balaji Nagothi wrote:

Hi Andres,

Can you please help us with the below information.

- Sizes of Tables and Indexes
- The Explain plan shows same/higher cost without ORDER BY clause ?

I suspect a huge number of NULLs might be the problem. If you can get us the above information, then we can probably tell whether the cost is genuine.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia



On Thu, Feb 27, 2014 at 9:31 PM, acanada <acanada@cnio.es> wrote:
Hello,

I have changed the multicolumn index from:
        "entity2document_name_qualifier_index" btree (name, qualifier)
 to:
         "document_qualifier_name_index" btree (qualifier, name)

And now the planner doesn't show a "Recheck Cond:" step (there are only three distinct qualifiers vs. millions of names...).

But it is still taking a long time:


Limit  (cost=384043.64..384043.66 rows=10 width=313) (actual time=80555.930..80555.934 rows=10 loops=1)
   ->  Sort  (cost=384043.64..384081.19 rows=15020 width=313) (actual time=80555.928..80555.931 rows=10 loops=1)
         Sort Key: d1_.hepval
         Sort Method:  top-N heapsort  Memory: 29kB
         ->  Nested Loop  (cost=0.00..383719.06 rows=15020 width=313) (actual time=223.778..80547.196 rows=3170 loops=1)
               ->  Index Scan using document_qualifier_name_index on entity2document e0_  (cost=0.00..52505.40 rows=15020 width=59) (actual time=126.880..11549.392 rows=3170 loops=1)
                     Index Cond: (((qualifier)::text = 'CompoundDict'::text) AND ((name)::text = 'galactosamine'::text))
               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=21.747..21.764 rows=1 loops=3170)
                     Index Cond: (d1_.id = e0_.document_id)
                     Filter: (d1_.hepval IS NOT NULL)
 Total runtime: 80556.027 ms



Any help/point to any direction, would be very appreciated.
Thank you,
Andrés

On Feb 26, 2014, at 4:41 PM, acanada wrote:

> Hello to everybody and thanks in advance to take a look to this message.
> I'm new in this list and with PostgreSQL.
> My queries are taking too much time to complete and I don't know what to do right now. I think I'm providing all  the info required for you to help me. If you need extra info please tell me.
>
> I am using DQL included in the latest version of Symfony2 (2.4.2). This is the query, formed by DQL, but copy-pasted into the psql client (9.1.11, server 8.3.8):
>
> explain analyze SELECT e0_.id AS id0, e0_.name AS name1, e0_.qualifier AS qualifier2, e0_."tagMethod" AS tagmethod3, e0_.curation AS curation4, e0_.created AS created5, e0_.updated AS updated6, d1_.id AS id7, d1_.kind AS kind8, d1_.uid AS uid9, d1_."sentenceId" AS sentenceid10, d1_.text AS text11, d1_.hepval AS hepval12, d1_.cardval AS cardval13, d1_.nephval AS nephval14, d1_.phosval AS phosval15, d1_."patternCount" AS patterncount16, d1_."ruleScore" AS rulescore17, d1_."hepTermNormScore" AS heptermnormscore18, d1_."hepTermVarScore" AS heptermvarscore19, d1_.created AS created20, d1_.updated AS updated21, e0_.document_id AS document_id22 FROM Entity2Document e0_ INNER JOIN documentold d1_ ON e0_.document_id = d1_.id WHERE e0_.name ='ranitidine' AND e0_.qualifier = 'CompoundDict' AND d1_.hepval IS NOT NULL ORDER BY d1_.hepval DESC limit 10;
>
>
> limtox=> \d+ documentold;
>                               Table "public.documentold"
>      Column      |              Type              | Modifiers | Storage  | Description
> ------------------+--------------------------------+-----------+----------+-------------
> id               | integer                        | not null  | plain    |
> kind             | character varying(255)         | not null  | extended |
> uid              | character varying(255)         | not null  | extended |
> sentenceId       | character varying(255)         | not null  | extended |
> text             | text                           | not null  | extended |
> hepval           | double precision               |           | plain    |
> created          | timestamp(0) without time zone | not null  | plain    |
> updated          | timestamp(0) without time zone |           | plain    |
> cardval          | double precision               |           | plain    |
> nephval          | double precision               |           | plain    |
> phosval          | double precision               |           | plain    |
> patternCount     | double precision               |           | plain    |
> ruleScore        | double precision               |           | plain    |
> hepTermNormScore | double precision               |           | plain    |
> hepTermVarScore  | double precision               |           | plain    |
> Indexes:
>    "DocumentOLD_pkey" PRIMARY KEY, btree (id)
>    "document_cardval_index" btree (cardval)
>    "document_heptermnorm_index" btree ("hepTermNormScore" DESC NULLS LAST)
>    "document_heptermvar_index" btree ("hepTermVarScore" DESC NULLS LAST)
>    "document_hepval_index" btree (hepval DESC NULLS LAST)
>    "document_kind_index" btree (kind)
>    "document_nephval_index" btree (nephval DESC NULLS LAST)
>    "document_patterncount_index" btree ("patternCount" DESC NULLS LAST)
>    "document_phosval_index" btree (phosval DESC NULLS LAST)
>    "document_rulescore_index" btree ("ruleScore" DESC NULLS LAST)
>    "document_sentenceid_index" btree ("sentenceId")
>    "document_uid_index" btree (uid)
> Referenced by:
>    TABLE "hepkeywordtermnorm2document" CONSTRAINT "fk_1c19bcd0c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "cytochrome2document" CONSTRAINT "fk_21f7636fc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "hepkeywordtermvariant2document" CONSTRAINT "fk_a316e36bc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "entity2document" CONSTRAINT "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
>    TABLE "specie2document" CONSTRAINT "fk_b6e551c8c33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
> Has OIDs: no
>
>
>
>
> limtox=> \d+ entity2document;
>                               Table "public.entity2document"
>    Column    |              Type              |            Modifiers            | Storage  | Description
> -------------+--------------------------------+---------------------------------+----------+-------------
> id          | integer                        | not null                        | plain    |
> document_id | integer                        |                                 | plain    |
> name        | character varying(255)         | not null                        | extended |
> qualifier   | character varying(255)         | not null                        | extended |
> tagMethod   | character varying(255)         | default NULL::character varying | extended |
> created     | timestamp(0) without time zone | not null                        | plain    |
> updated     | timestamp(0) without time zone |                                 | plain    |
> curation    | integer                        |                                 | plain    |
> Indexes:
>    "Entity2Document_pkey" PRIMARY KEY, btree (id)
>    "entity2Document_name_index" btree (name)
>    "entity2document_name_qualifier_index" btree (name, qualifier)
>    "idx_a6020c0dc33f7837" btree (document_id)
>    "qualifier_index" btree (qualifier)
> Foreign-key constraints:
>    "fk_a6020c0dc33f7837" FOREIGN KEY (document_id) REFERENCES documentold(id)
> Has OIDs: no
>
>
>
>
>
>
> Table metadata:
>    documentold: 124,515,592 rows. It has several columns with a large proportion of NULLs (updated, patternCount, ruleScore, hepTermNormScore, hepTermVarScore).
>    entity2document: 93,785,968 rows. It has two columns with a large proportion of NULLs (updated, curation).
>
> None of the tables receive updates or deletes regularly
>
>
>                                                                                QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=387929.02..387929.05 rows=10 width=313) (actual time=55980.472..55980.476 rows=10 loops=1)
>   ->  Sort  (cost=387929.02..387966.75 rows=15090 width=313) (actual time=55980.471..55980.473 rows=10 loops=1)
>         Sort Key: d1_.hepval
>         Sort Method:  top-N heapsort  Memory: 28kB
>         ->  Nested Loop  (cost=469.14..387602.93 rows=15090 width=313) (actual time=96.716..55974.004 rows=2774 loops=1)
>               ->  Bitmap Heap Scan on entity2document e0_  (cost=469.14..54851.25 rows=15090 width=59) (actual time=51.299..8452.592 rows=2774 loops=1)
>                     Recheck Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
>                     ->  Bitmap Index Scan on entity2document_name_qualifier_index  (cost=0.00..465.36 rows=15090 width=0) (actual time=36.467..36.467 rows=2774 loops=1)
>                           Index Cond: (((name)::text = 'Cimetidine'::text) AND ((qualifier)::text = 'CompoundDict'::text))
>               ->  Index Scan using "DocumentOLD_pkey" on documentold d1_  (cost=0.00..22.04 rows=1 width=254) (actual time=17.113..17.129 rows=1 loops=2774)
>                     Index Cond: (d1_.id = e0_.document_id)
>                     Filter: (d1_.hepval IS NOT NULL)
> Total runtime: 55980.554 ms
> (13 rows)
>
> version
> -----------------------------------------------------------------------------------------------------
> PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 10.4.1-3ubuntu3) 10.4.1
>
> This query has always been slow. It's fast only when it's cached. VACUUM and ANALYZE have been run manually very recently.
>
>
>
>
> SELECT name, current_setting(name), source
>  FROM pg_settings
>  WHERE source NOT IN ('default', 'override');
>
>             name            |  current_setting   |        source
> ----------------------------+--------------------+----------------------
> client_encoding            | UTF8               | client
> DateStyle                  | ISO, DMY           | configuration file
> default_text_search_config | pg_catalog.spanish | configuration file
> effective_cache_size       | 7500MB             | configuration file
> lc_messages                | es_ES.UTF-8        | configuration file
> lc_monetary                | es_ES.UTF-8        | configuration file
> lc_numeric                 | C                  | configuration file
> lc_time                    | es_ES.UTF-8        | configuration file
> listen_addresses           | *                  | configuration file
> log_line_prefix            | %t                 | configuration file
> log_timezone               | localtime          | command line
> maintenance_work_mem       | 2000MB             | configuration file
> max_connections            | 100                | configuration file
> max_fsm_pages              | 63217760           | configuration file
> max_stack_depth            | 2MB                | environment variable
> port                       | 5432               | configuration file
> shared_buffers             | 1500MB             | configuration file
> ssl                        | on                 | configuration file
> tcp_keepalives_count       | 9                  | configuration file
> tcp_keepalives_idle        | 7200               | configuration file
> tcp_keepalives_interval    | 75                 | configuration file
> TimeZone                   | localtime          | command line
> timezone_abbreviations     | Default            | command line
> work_mem                   | 50MB               | configuration file
>
> Setting the work_mem to 3000MB doesn't change anything...
>
> Everything seems fine to me except the Recheck Cond, which, because of the large amount of rows, is slowing the query down too much. I have read that trying to get rid of the recheck cond is not a good idea (maybe it's not even possible, I don't know; I'm new to PostgreSQL). I'd like to know what I am doing wrong and how I can solve it...
>
> Any help please?
>
> Thank you very much,
>
> Andrés
> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance










Re: Query taking long time

From
Vladimir Sitnikov
Date:


> I have simplified the query and applied the latest advice you gave me:
>
> Query: 
>
>  explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>        
Do you need full result?

If you need just the top-n rows, then an index on entity_compounddict2document(name, hepval) might help.
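A sketch of that suggestion (the index name is assumed; PostgreSQL 8.3 supports DESC NULLS LAST index columns):

```sql
-- Equality column first, sort column second, matching the ORDER BY direction.
CREATE INDEX entity_compound2document_name_hepval
    ON entity_compounddict2document (name, hepval DESC NULLS LAST);

-- A top-10 query can then stop after reading ten matching index entries:
SELECT *
FROM entity_compounddict2document
WHERE name = 'ranitidine'
ORDER BY hepval DESC NULLS LAST
LIMIT 10;
```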

Regards,
Vladimir Sitnikov

Re: Query taking long time

From
Venkata Balaji Nagothi
Date:
On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

Thank you for your answer.
I have made more changes than a simple re-indexing recently. I have moved the sorting field into the table in order to avoid the join. Now the schema is very simple and the query involves only one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

           tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has approx. 54,000,000 rows.
There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.

I have simplified the query and applied the latest advice you gave me:

Query: 

 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 32717.548 ms

Another query:
explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 9296.815 ms


It has improved (I suppose because the join is gone) but it's still taking a lot of time... Is there anything I can do?

Any help would be much appreciated. Thank you very much.


Good to know performance has increased.

Does the "entity_compounddict2document" table go through heavy INSERTs?

Can you also tell us whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value from the pg_stats view would have that info.

Below could be a possible workaround -

As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
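That workaround could look like this (a sketch; the index name is made up, and CLUSTER takes an exclusive lock and rewrites the whole table, so it needs a maintenance window and free disk space):

```sql
CREATE INDEX entity_compound2document_name_hepval
    ON entity_compounddict2document (name, hepval);

-- Physically reorder the heap so rows for one compound are adjacent.
CLUSTER entity_compounddict2document
    USING entity_compound2document_name_hepval;

-- Refresh planner statistics after the rewrite.
ANALYZE entity_compounddict2document;
```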

Other recommendations -

Please drop all the non-primary-key indexes which have 0 scans/hits. They add overhead to the DB and the DB server during maintenance and DML operations.
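Candidates can be listed from the statistics views before dropping anything (a sketch using the standard pg_stat_user_indexes and pg_index catalogs):

```sql
-- Non-primary-key indexes that have never been scanned, largest first.
SELECT s.indexrelname,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisprimary
ORDER BY pg_relation_size(s.indexrelid) DESC;

-- Then drop each confirmed-unused one, e.g.:
-- DROP INDEX document_cardval_index;
```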

Regards,
Venkata Balaji N

Fujitsu Australia

Re: Query taking long time

From
"acanada"
Date:
Hello!
The table doesn't go through heavy inserts, so I'm taking your CLUSTER advice into account. Thanks.
I'm afraid I cannot drop the indexes that have no scan hits yet, because they will be getting scans and hits very soon.

Duplicated values for this table are:

          tablename           | attname | n_distinct
------------------------------+---------+-------------
 entity_compounddict2document | name    |       16635
 entity_compounddict2document | hepval  | 2.04444e+06
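(Figures like these can be pulled from the planner statistics with something along these lines:)

```sql
SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE tablename = 'entity_compounddict2document'
  AND attname IN ('name', 'hepval');
```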

Thank you very much for your help!!
Andrés

On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

Thank you for your answer.
I have made more changes than a simple re-indexing recently. I have moved the sorting field into the table in order to avoid the join. Now the schema is very simple and the query involves only one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

           tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has approx. 54,000,000 rows.
There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.
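For what it's worth, the claim about NULLs can be double-checked directly (a quick sketch using the table and column names above; on a 54M-row table this is a full sequential scan, so it takes a while):

```sql
-- count(hepval) counts only non-NULL values, so the difference
-- is the number of rows with a NULL hepval (expected: 0).
SELECT count(*)                 AS total_rows,
       count(*) - count(hepval) AS null_hepvals
FROM entity_compounddict2document;
```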

I have simplified the query and applied the last piece of advice you gave me:

Query: 

 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 32717.548 ms

Another query:
explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 9296.815 ms


It has improved (I suppose because the join is gone), but it is still taking a long time... Is there anything else I can do?

Any help would be very appreciated. Thank you very much.


Good to know performance has increased.

Does the "entity_compounddict2document" table go through heavy INSERTs?

Can you help us know whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value in the pg_stats view would have that info.
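Those figures can be read straight from the pg_stats view (a sketch; it assumes ANALYZE has been run so the statistics are populated):

```sql
-- Negative n_distinct values mean "-(fraction of rows that are distinct)":
-- -1 means every value is unique; a small positive number means heavy duplication.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'entity_compounddict2document'
  AND attname IN ('name', 'hepval');
```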

Below is a possible workaround -

As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
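That suggestion could look like the following (the index name here is made up; note that CLUSTER takes an exclusive lock and rewrites the whole table, so schedule it for a maintenance window):

```sql
-- Hypothetical composite index on (name, hepval)
CREATE INDEX entity_compound2document_name_hepval
    ON entity_compounddict2document (name, hepval);

-- Physically reorder the table rows by that index
-- (this form of CLUSTER is available from PostgreSQL 8.3 on)
CLUSTER entity_compounddict2document
    USING entity_compound2document_name_hepval;

-- Refresh planner statistics after the rewrite
ANALYZE entity_compounddict2document;
```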

Other recommendations -

Please drop all the non-primary-key indexes that show 0 scans/hits. They harm the DB and the DB server during maintenance and DML operations.
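The zero-scan indexes can be listed from pg_stat_user_indexes before dropping anything (a sketch; review each name before running any DROP, since the counters reset with the statistics collector):

```sql
-- Indexes never used by a scan since statistics were last reset
SELECT schemaname,
       relname      AS tablename,
       indexrelname AS indexname,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;

-- Then, for each confirmed-unused, non-primary-key index, e.g.:
-- DROP INDEX entity_compound2document_cardval;
```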

Regards,
Venkata Balaji N

Fujitsu Australia



**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.

Re: Query taking long time

From
"acanada"
Date:
Hello,

I don't know if this helps to figure out the problem, but after adding the multicolumn index on name and hepval, the performance is even worse. Ten times worse...

explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
                                                                         QUERY PLAN                                                                          
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 95811.838 ms
(8 rows)

Any ideas please?

Thank you 
Andrés.



On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

Thank you for your answer.
I have made more changes than a simple re-indexing recently: I moved the sorting field into this table to avoid the join clause. The schema is now very simple, and the query involves only one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no


Re: Query taking long time

From
Venkata Balaji Nagothi
Date:
After looking at the distinct values: yes, the composite index on "name" and "hepval" is not recommended. That it would make things worse is expected.

We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this.

Meanwhile, can you let us know the memory parameters (work_mem, temp_buffers, etc.) that are set?

Do you have any other processes affecting this query's performance?

Any info about your Disk, RAM, CPU would also help.

Regards,
Venkata Balaji N

Fujitsu Australia








Re: Query taking long time

From
"acanada"
Date:
Hi Venkata,

Increasing work_mem doesn't improve the results. After raising it to 1GB:

limtox=> explain analyze select * from entity_compounddict2document  where name='Troglitazone' order by hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11083.47..11090.54 rows=2828 width=133) (actual time=19679.354..19679.494 rows=1283 loops=1)
   Sort Key: hepval
   Sort Method:  quicksort  Memory: 238kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.87..10921.34 rows=2828 width=133) (actual time=93.926..19677.110 rows=1283 loops=1)
         Recheck Cond: ((name)::text = 'Troglitazone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.16 rows=2828 width=0) (actual time=78.005..78.005 rows=1283 loops=1)
               Index Cond: ((name)::text = 'Troglitazone'::text)
 Total runtime: 19679.680 ms

There are no temp files in the data directory... I set log_temp_files to 1MB and ran this query again, but there's nothing related to temp files in the log...

I cannot see a "temp_files" column in the pg_stat_database view (I'm using version 8.3) :-(


Thanks for your help. Regards,

Andrés



On Mar 6, 2014, at 2:36 AM, Venkata Balaji Nagothi wrote:

Hi Andres,

The sorting cost is high!

This query must be going to disk for the sort. Do you see temp files being created in the pgsql_tmp directory inside the data directory?

Or you can set "log_temp_files" to 1 MB or so. This will log information about temp files to the server log in the pg_log directory.

Or you can check the "temp_files" column in the pg_stat_database view before and after executing the query.
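For reference, that check looks like the query below. Note that the temp_files and temp_bytes columns only exist in PostgreSQL 9.2 and later, so they will not be visible on an 8.3 server:

```sql
-- Requires PostgreSQL 9.2+; counters accumulate since the last stats reset
SELECT datname, temp_files, temp_bytes
FROM pg_stat_database
WHERE datname = current_database();
```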

How much work_mem did you give it? You can raise work_mem to, say, 1 GB at the session level and re-run the query. That might produce different results.
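A session-level override along these lines leaves postgresql.conf untouched and reverts when the session ends:

```sql
SET work_mem = '1GB';   -- affects only the current session

EXPLAIN ANALYZE
SELECT *
FROM entity_compounddict2document
WHERE name = 'Troglitazone'
ORDER BY hepval;

RESET work_mem;         -- back to the configured value
```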

Dropping the other indexes will not affect this query's performance. I recommended dropping them to avoid maintenance and performance overhead in the future.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia


On Wed, Mar 5, 2014 at 10:14 PM, acanada <acanada@cnio.es> wrote:
Hi,

This are the parameters of the server:

 SELECT name, current_setting(name), source
  FROM pg_settings
  WHERE source NOT IN ('default', 'override');

            name            |  current_setting   |        source        
----------------------------+--------------------+----------------------
 client_encoding            | UTF8               | client
 DateStyle                  | ISO, DMY           | configuration file
 default_statistics_target  | 100                | configuration file
 default_text_search_config | pg_catalog.spanish | configuration file
 effective_cache_size       | 7500MB             | configuration file
 lc_messages                | es_ES.UTF-8        | configuration file
 lc_monetary                | es_ES.UTF-8        | configuration file
 lc_numeric                 | C                  | configuration file
 lc_time                    | es_ES.UTF-8        | configuration file
 listen_addresses           | *                  | configuration file
 log_line_prefix            | %t                 | configuration file
 log_timezone               | localtime          | command line
 maintenance_work_mem       | 2000MB             | configuration file
 max_connections            | 100                | configuration file
 max_fsm_pages              | 63217760           | configuration file
 max_stack_depth            | 2MB                | environment variable
 port                       | 5432               | configuration file
 shared_buffers             | 1500MB             | configuration file
 ssl                        | on                 | configuration file
 tcp_keepalives_count       | 9                  | configuration file
 tcp_keepalives_idle        | 7200               | configuration file
 tcp_keepalives_interval    | 75                 | configuration file
 TimeZone                   | localtime          | command line
 timezone_abbreviations     | Default            | command line
 work_mem                   | 50MB               | configuration file


The server has two quad-core processors and 10GB of RAM; the data sits on a 2TB fibre-channel disk.
Changing the work_mem parameter seems to have no effect on performance.

Now I have a curious situation. I created the new table we are querying against, entity_compounddict2document, which has fewer rows (approx. 50M) than the original table entity2document2 (94M rows).
Well, after dropping the indexes that were not in use, both tables perform the same on this query...


 explain analyze select * from entity_compounddict2document  where name='Troglitazone' order by hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11083.47..11090.54 rows=2828 width=133) (actual time=19708.019..19708.136 rows=1283 loops=1)
   Sort Key: hepval
   Sort Method:  quicksort  Memory: 238kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.87..10921.34 rows=2828 width=133) (actual time=44.292..19705.954 rows=1283 loops=1)
         Recheck Cond: ((name)::text = 'Troglitazone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.16 rows=2828 width=0) (actual time=28.159..28.159 rows=1283 loops=1)
               Index Cond: ((name)::text = 'Troglitazone'::text)
 Total runtime: 19708.275 ms
(8 rows)


explain analyze select * from entity2document2  where name='Troglitazone' order by hepval;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18237.75..18249.38 rows=4653 width=123) (actual time=18945.732..18945.869 rows=1283 loops=1)
   Sort Key: hepval
   Sort Method:  quicksort  Memory: 238kB
   ->  Bitmap Heap Scan on entity2document2  (cost=117.37..17954.29 rows=4653 width=123) (actual time=41.703..18943.720 rows=1283 loops=1)
         Recheck Cond: ((name)::text = 'Troglitazone'::text)
         ->  Bitmap Index Scan on entity2document2_name  (cost=0.00..116.20 rows=4653 width=0) (actual time=28.703..28.703 rows=1283 loops=1)
               Index Cond: ((name)::text = 'Troglitazone'::text)
 Total runtime: 18945.991 ms
(8 rows)
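Both plans spend roughly 19 seconds in the bitmap heap scan to fetch only 1,283 rows, which points at random heap I/O rather than the sort. As a quick check of how much on-disk data is involved, the standard size functions (available since 8.1) can be run against either table; a minimal sketch using the table names from this thread:

```sql
-- Compare heap size vs. total size (heap + indexes + TOAST) for the table.
SELECT pg_size_pretty(pg_relation_size('entity_compounddict2document'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('entity_compounddict2document')) AS total_size;
```

If the heap is far larger than RAM, each of those ~1,300 row fetches is likely a random read from disk, which would explain the runtime.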

The descriptions of the tables are:

limtox=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
Has OIDs: no


limtox=> \d+ entity2document2;
                            Table "public.entity2document2"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity2document2_pkey" PRIMARY KEY, btree (id)
    "entity2document2_hepval_index" btree (hepval)
    "entity2document2_name" btree (name)
    "entity2document2_qualifier_name_hepval" btree (qualifier, name)
    "entity2document_qualifier_index" btree (qualifier)
Has OIDs: no



I really appreciate your help!!

Regards,
Andrés




On Mar 5, 2014, at 12:35 AM, Venkata Balaji Nagothi wrote:

After looking at the distinct values, yes, the composite index on "name" and "hepval" is not recommended. That would make things worse - it's expected.

We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this.

Meanwhile, can you let us know the memory parameters (work_mem, temp_buffers, etc.) that are set?

Do you have any other processes affecting this query's performance?

Any info about your Disk, RAM, CPU would also help.

Regards,
Venkata Balaji N

Fujitsu Australia




Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia


On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
Hello,

I don't know if this helps to figure out what the problem is, but after adding the multicolumn index on name and hepval, the performance is even worse (!?). Ten times worse...

explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
                                                                         QUERY PLAN                                                                          
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 95811.838 ms
(8 rows)

Any ideas please?

Thank you 
Andrés.



On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

Thank you for your answer.
I have made more changes than a simple re-indexing recently. I moved the sorting field into the table in order to avoid the join clause. Now the schema is very simple; the query involves only one table:

x=> \d+ entity_compounddict2document;
                      Table "public.entity_compounddict2document"
      Column      |              Type              | Modifiers | Storage  | Description 
------------------+--------------------------------+-----------+----------+-------------
 id               | integer                        | not null  | plain    | 
 document_id      | integer                        |           | plain    | 
 name             | character varying(255)         |           | extended | 
 qualifier        | character varying(255)         |           | extended | 
 tagMethod        | character varying(255)         |           | extended | 
 created          | timestamp(0) without time zone |           | plain    | 
 updated          | timestamp(0) without time zone |           | plain    | 
 curation         | integer                        |           | plain    | 
 hepval           | double precision               |           | plain    | 
 cardval          | double precision               |           | plain    | 
 nephval          | double precision               |           | plain    | 
 phosval          | double precision               |           | plain    | 
 patternCount     | double precision               |           | plain    | 
 ruleScore        | double precision               |           | plain    | 
 hepTermNormScore | double precision               |           | plain    | 
 hepTermVarScore  | double precision               |           | plain    | 
Indexes:
    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
    "entity_compound2document_cardval" btree (cardval)
    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
    "entity_compound2document_hepval" btree (hepval)
    "entity_compound2document_name" btree (name)
    "entity_compound2document_nephval" btree (nephval)
    "entity_compound2document_patterncount" btree ("patternCount")
    "entity_compound2document_phosval" btree (phosval)
    "entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

           tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
 entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
 entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
 entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0

The table has approx. 54,000,000 rows.
There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.

I have simplified the query and applied the last piece of advice you gave me:

Query: 

 explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
         Recheck Cond: ((name)::text = 'ranitidine'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
               Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 32717.548 ms

Another query:
explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
   Sort Key: entity_compounddict2document.hepval
   Sort Method:  quicksort  Memory: 25622kB
   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
         Recheck Cond: ((name)::text = 'progesterone'::text)
         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
               Index Cond: ((name)::text = 'progesterone'::text)
 Total runtime: 9296.815 ms


It has improved (I suppose because the join is gone), but it is still taking a lot of time... Anything I can do?

Any help would be very appreciated. Thank you very much.


Good to know performance has increased.

Does the "entity_compounddict2document" table go through heavy INSERTs?

Can you tell us whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value from the pg_stats view would have that info.
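That n_distinct check can be run directly against pg_stats; a sketch using the table name from this thread:

```sql
-- n_distinct close to -1 means nearly unique values; a small positive
-- number means heavy duplication (many rows share the same value).
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'entity_compounddict2document'
  AND attname IN ('name', 'hepval');
```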

Below could be a possible workaround -

As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
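A sketch of that CLUSTER suggestion. Note the server in this thread is 8.3, which uses the older syntax; the CLUSTER ... USING form arrived in 8.4:

```sql
-- 8.3 syntax: rewrite the table so rows with the same name are physically
-- contiguous on disk, then refresh the planner statistics.
CLUSTER entity_compound2document_name ON entity_compounddict2document;
ANALYZE entity_compounddict2document;

-- On 8.4 and later, the equivalent is:
-- CLUSTER entity_compounddict2document USING entity_compound2document_name;
```

CLUSTER takes an exclusive lock and rewrites the whole table, so it should be run in a maintenance window; the clustering is also not maintained by later inserts.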

Other recommendations -

Please drop all the non-primary-key indexes that have 0 scans/hits. They harm the DB and the DB server during maintenance and DML operations.
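The zero-scan indexes can be listed from the statistics collector; a sketch:

```sql
-- pg_stat_user_indexes tracks idx_scan per index; indexes never scanned
-- since the stats were last reset are candidates for dropping.
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'entity_compounddict2document'
  AND idx_scan = 0;

-- Then drop each unused one, e.g.:
-- DROP INDEX entity_compound2document_cardval;
```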

Regards,
Venkata Balaji N

Fujitsu Australia



**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.










Re: Query taking long time

From
desmodemone
Date:


On 05 Mar 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
> [previous message quoted in full; trimmed]

Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.

Mat Dba

Re: Query taking long time

From
"acanada"
Date:
Hello Mat,

Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=> SET enable_bitmapscan=off; 
SET
x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)
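The plain index scan being even slower is consistent with the matching rows being scattered across the heap, so every index hit costs a random read. One hedged way to check this is the correlation statistic in pg_stats, which shows how well the physical row order tracks the column's sort order (values near ±1 mean well-clustered):

```sql
-- A low |correlation| for "name" means rows for a given name are spread
-- across many heap pages, making index-driven fetches expensive.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'entity2document2'
  AND attname = 'name';
```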

Any other idea? 

Thank you very much for your help. Regards,
Andrés

On Mar 6, 2014, at 2:11 PM, desmodemone wrote:


On 05 Mar 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
> [earlier thread quoted in full; trimmed]
>>>>
>>>>            tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
>>>>  entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
>>>>  entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>
>>>> The table has approx. 54,000,000 rows
>>>> There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.
>>>>
>>>> I have simplified the query and applied the last advice you gave me:
>>>>
>>>> Query: 
>>>>
>>>>  explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>>>>                                                                       QUERY PLAN                                                                      
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 2301kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>          Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>                Index Cond: ((name)::text = 'ranitidine'::text)
>>>>  Total runtime: 32717.548 ms
>>>>
>>>> Another query:
>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;
>>>>
>>>> QUERY PLAN
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 25622kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>                Index Cond: ((name)::text = 'progesterone'::text)
>>>>  Total runtime: 9296.815 ms
>>>>
>>>>
>>>> It has improved (I suppose because the join is gone) but it is still taking a lot of time... Is there anything I can do?
>>>>
>>>> Any help would be very appreciated. Thank you very much.
>>>
>>>
>>>
>>> Good to know performance has increased.
>>>
>>> Does the "entity_compounddict2document" table go through heavy INSERTs?
>>>
>>> Can you help us know if the "hepval" column and "name" column have high duplicate values? The "n_distinct" value from the pg_stats view would have that info.
>>>
>>> Below could be a possible workaround -
>>>
>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using the same index.
>>>
>>> Other recommendations -
>>>
>>> Please drop all the non-primary-key indexes which have 0 scans / hits. Keeping them harms the DB and the DB server during maintenance and DML operations.
>>>
>>> Regards,
>>> Venkata Balaji N
>>>
>>> Fujitsu Australia
>>
>>
>>
>> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
>>
>> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>>
>
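The n_distinct and zero-scan-index checks suggested in the quoted advice above can be sketched as catalog queries (a sketch only; the table and column names are taken from the thread):

```sql
-- Duplicate-value estimate for the filter and sort columns.
-- n_distinct > 0 is an absolute count of distinct values;
-- n_distinct < 0 is a fraction of the row count (e.g. -1 means all-unique).
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'entity_compounddict2document'
  AND attname IN ('name', 'hepval');

-- Indexes that have never been scanned are candidates for dropping.
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'entity_compounddict2document'
ORDER BY idx_scan;
```

Note that idx_scan counters are cumulative since the last statistics reset, so a zero should be interpreted against how long the server has been up.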


Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.

Mat Dba
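If the bitmap really is going lossy, a related session-level experiment (not shown in the thread) is to raise work_mem so the bitmap can stay exact instead; a sketch, with the setting value purely illustrative:

```sql
-- Raise per-operation memory for this session only (value is illustrative;
-- it must fit within the server's available RAM for concurrent sessions).
SET work_mem = '256MB';

-- When the bitmap exceeds work_mem it degrades to page granularity (lossy),
-- and every tuple on the flagged heap pages must be rechecked.
EXPLAIN ANALYZE
SELECT * FROM entity_compounddict2document
WHERE name = 'ranitidine'
ORDER BY hepval;
```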




Re: Query taking long time

From
Evgeniy Shishkin
Date:
> Hello Mat,
>
> Setting enable_bitmapscan to off doesn't really help. It gets worse...
>
> x=> SET enable_bitmapscan=off;
> SET
> x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
>                                                                           QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
>    Sort Key: entity2document2.hepval
>    Sort Method:  quicksort  Memory: 2301kB
>    ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
>          Index Cond: ((name)::text = 'ranitidine'::text)
>  Total runtime: 79967.705 ms
> (6 rows)
>
> Any other idea?
>

Please post your hardware configuration. I think your DB is coming from disk, and the disks are slow.
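Whether the time really is going to disk can be checked from the statistics views; a sketch (the counters are cumulative, so compare snapshots taken before and after running the slow query):

```sql
-- Heap blocks read from disk vs. found in shared buffers for this table.
-- A low hit_ratio during the query points at random I/O as the bottleneck.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_tables
WHERE relname = 'entity_compounddict2document';
```

Note that heap_blks_read counts reads from PostgreSQL's point of view; some of those may still be served from the OS page cache rather than the physical disk.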



> Thank you very much for your help. Regards,
> Andrés
>
> On Mar 6, 2014, at 2:11 PM, desmodemone wrote:
>
>>
>> On 05 Mar 2014 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
>> >
>> > After looking at the distinct values, yes, the composite index on "name" and "hepval" is not recommended. That would make things worse - it's expected.
>> >
>> > We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this.
>> >
>> > Meanwhile - can you let us know the memory parameters (work_mem, temp_buffers etc.) that are set?
>> >
>> > Do you have any other processes affecting this query's performance?
>> >
>> > Any info about your Disk, RAM, CPU would also help.
>> >
>> > Regards,
>> > Venkata Balaji N
>> >
>> > Fujitsu Australia
>> >



Re: Query taking long time

From
"acanada"
Date:
On Mar 7, 2014, at 10:39 AM, Evgeniy Shishkin wrote:

>
> Please post your hardware configuration. I think your DB is coming from disk, and the disks are slow.

The server has 2 quad-core processors and 10GB of RAM, and the data is located on a 2TB fiber disk. It doesn't seem to be the problem...

Thank you

Andrés
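Given the access pattern (fetch all ~138,000 rows for one name scattered across a ~6.7GB heap, then sort), one remedy the thread suggests but has not yet shown is to physically reorder the table by name so the matching rows sit on contiguous pages; a sketch (CLUSTER rewrites the table under an exclusive lock, so it needs downtime and free disk space, and the ordering decays as new rows are inserted):

```sql
-- Rewrite the table in entity_compound2document_name index order
-- (CLUSTER ... USING is the 8.3 syntax; takes an ACCESS EXCLUSIVE lock).
CLUSTER entity_compounddict2document USING entity_compound2document_name;

-- Refresh planner statistics (including physical correlation) afterwards.
ANALYZE entity_compounddict2document;
```

After clustering, the bitmap heap scan for a single name should touch far fewer heap pages, since all rows with the same name are adjacent on disk.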

>
>
>
>> Thank you very much for your help. Regards,
>> Andrés
>>
>> El Mar 6, 2014, a las 2:11 PM, desmodemone escribió:
>>
>>>
>>> Il 05/mar/2014 00:36 "Venkata Balaji Nagothi" <vbnpgc@gmail.com> ha scritto:
>>>>
>>>> After looking at the distinct values, yes the composite Index on "name" and "hepval" is not recommended. That
wouldworsen - its expected. 
>>>>
>>>> We need to look for other possible work around. Please drop off the above Index. Let me see if i can drill further
intothis. 
>>>>
>>>> Meanwhile - can you help us know the memory parameters (work_mem, temp_buffers etc) set ?
>>>>
>>>> Do you have any other processes effecting this query's performance ?
>>>>
>>>> Any info about your Disk, RAM, CPU would also help.
>>>>
>>>> Regards,
>>>> Venkata Balaji N
>>>>
>>>> Fujitsu Australia
>>>>
>>>>
>>>>
>>>>
>>>> Venkata Balaji N
>>>>
>>>> Sr. Database Administrator
>>>> Fujitsu Australia
>>>>
>>>>
>>>> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I don't know if this helps to figure out what is the problem but after adding the multicolumn index on name and
hepval,the performance is even worse (¿?).  Ten times worse... 
>>>>>
>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order
bya.hepval; 
>>>>>                                                                         QUERY PLAN
                                          
>>>>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>> Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>>>>   Sort Key: entity_compounddict2document.hepval
>>>>>   Sort Method:  quicksort  Memory: 25622kB
>>>>>   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual
time=70.789..95519.258rows=138165 loops=1) 
>>>>>         Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>>         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual
time=35.174..35.174rows=138165 loops=1) 
>>>>>               Index Cond: ((name)::text = 'progesterone'::text)
>>>>> Total runtime: 95811.838 ms
>>>>> (8 rows)
>>>>>
>>>>> Any ideas please?
>>>>>
>>>>> Thank you
>>>>> Andrés.
>>>>>
>>>>>
>>>>>
>>>>> El Mar 4, 2014, a las 12:28 AM, Venkata Balaji Nagothi escribió:
>>>>>
>>>>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thankyou for your answer.
>>>>>>> I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in
orderto avoid the join clause. Now the schema is very simple. The query only implies one table: 
>>>>>>>
>>>>>>> x=> \d+ entity_compounddict2document;
>>>>>>>                      Table "public.entity_compounddict2document"
>>>>>>>      Column      |              Type              | Modifiers | Storage  | Description
>>>>>>> ------------------+--------------------------------+-----------+----------+-------------
>>>>>>> id               | integer                        | not null  | plain    |
>>>>>>> document_id      | integer                        |           | plain    |
>>>>>>> name             | character varying(255)         |           | extended |
>>>>>>> qualifier        | character varying(255)         |           | extended |
>>>>>>> tagMethod        | character varying(255)         |           | extended |
>>>>>>> created          | timestamp(0) without time zone |           | plain    |
>>>>>>> updated          | timestamp(0) without time zone |           | plain    |
>>>>>>> curation         | integer                        |           | plain    |
>>>>>>> hepval           | double precision               |           | plain    |
>>>>>>> cardval          | double precision               |           | plain    |
>>>>>>> nephval          | double precision               |           | plain    |
>>>>>>> phosval          | double precision               |           | plain    |
>>>>>>> patternCount     | double precision               |           | plain    |
>>>>>>> ruleScore        | double precision               |           | plain    |
>>>>>>> hepTermNormScore | double precision               |           | plain    |
>>>>>>> hepTermVarScore  | double precision               |           | plain    |
>>>>>>> Indexes:
>>>>>>>    "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>>>>>>>    "entity_compound2document_cardval" btree (cardval)
>>>>>>>    "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>>>>>>>    "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>>>>>>>    "entity_compound2document_hepval" btree (hepval)
>>>>>>>    "entity_compound2document_name" btree (name)
>>>>>>>    "entity_compound2document_nephval" btree (nephval)
>>>>>>>    "entity_compound2document_patterncount" btree ("patternCount")
>>>>>>>    "entity_compound2document_phosval" btree (phosval)
>>>>>>>    "entity_compound2document_rulescore" btree ("ruleScore")
>>>>>>> Has OIDs: no
>>>>>>>
>>>>>>>           tablename            |                  indexname                 |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
>>>>>>> entity_compounddict2document   | entity_compound2document_cardval           | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_heptermnormscore  | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_heptermvarscore   | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_hepval            | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_name              | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_nephval           | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_patterncount      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_phosval           | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compound2document_rulescore         | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>> entity_compounddict2document   | entity_compounddict2document_pkey          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>>>>
>>>>>>> The table has approximately 54,000,000 rows.
>>>>>>> There are no NULLs in the hepval field and pg_settings haven't changed. I have also run "analyze" on this table.
>>>>>>>
>>>>>>> I have simplified the query and applied the latest advice you gave me:
>>>>>>>
>>>>>>> Query:
>>>>>>>
>>>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>>>>>>>                                                                      QUERY PLAN
                                     
>>>>>>>
------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>> Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>>>>   Sort Key: entity_compounddict2document.hepval
>>>>>>>   Sort Method:  quicksort  Memory: 2301kB
>>>>>>>   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>>>>         Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>>>>         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>>>>               Index Cond: ((name)::text = 'ranitidine'::text)
>>>>>>> Total runtime: 32717.548 ms
>>>>>>>
>>>>>>> Another query:
>>>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a order by a.hepval;
>>>>>>>
>>>>>>> QUERY PLAN
>>>>>>>
------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>> Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>>>>   Sort Key: entity_compounddict2document.hepval
>>>>>>>   Sort Method:  quicksort  Memory: 25622kB
>>>>>>>   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>>>>         Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>>>>         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>>>>               Index Cond: ((name)::text = 'progesterone'::text)
>>>>>>> Total runtime: 9296.815 ms
>>>>>>>
>>>>>>>
>>>>>>> It has improved (I suppose because the join has been removed), but it is still taking a lot of time... Is there anything I can do?
>>>>>>>
>>>>>>> Any help would be much appreciated. Thank you very much.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Good to know performance has increased.
>>>>>>
>>>>>> Does the "entity_compounddict2document" table go through heavy INSERT activity?
>>>>>>
>>>>>> Can you tell us whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value in the pg_stats view would show that.
>>>>>>
>>>>>> Below could be a possible workaround -
>>>>>>
>>>>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
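A sketch of that suggested workaround in SQL (the index name here is illustrative; note that CLUSTER rewrites the whole table under an ACCESS EXCLUSIVE lock, so it should run in a maintenance window):

```sql
-- Composite index matching the query shape: WHERE name = ... ORDER BY hepval
CREATE INDEX entity_compounddict2document_name_hepval
    ON entity_compounddict2document (name, hepval);

-- Physically reorder the heap in index order so the rows for any one name
-- are contiguous, turning the scattered heap fetches into sequential reads
CLUSTER entity_compounddict2document USING entity_compounddict2document_name_hepval;

-- Refresh planner statistics after the rewrite
ANALYZE entity_compounddict2document;
```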
>>>>>>
>>>>>> Other recommendations -
>>>>>>
>>>>>> Please drop all of the non-primary-key indexes that have 0 scans / hits. They harm the DB and the DB server during maintenance and DML operations.
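The zero-scan indexes can be identified from the statistics views before dropping them; a sketch (verify idx_scan over a full workload cycle first, since the counters can be reset):

```sql
-- List non-primary-key indexes that have never been used by any scan
SELECT relname       AS table_name,
       indexrelname  AS index_name,
       idx_scan
FROM   pg_stat_user_indexes
WHERE  idx_scan = 0
AND    indexrelname NOT LIKE '%_pkey';

-- Then drop each one individually, e.g.:
DROP INDEX entity_compound2document_cardval;
```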
>>>>>>
>>>>>> Regards,
>>>>>> Venkata Balaji N
>>>>>>
>>>>>> Fujitsu Australia
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>>>>>
>>>>
>>>
>>>
>>>
>>> Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.
>>>
>>> Mat Dba
>>>
>>
>>
>>
>>
>>
>





Re: Query taking long time

From
Evgeniy Shishkin
Date:
On 07 Mar 2014, at 12:46, acanada <acanada@cnio.es> wrote:

>
> On Mar 7, 2014, at 10:39 AM, Evgeniy Shishkin wrote:
>
>>
>>> Hello Mat,
>>>
>>> Setting enable_bitmapscan to off doesn't really help. It gets worse...
>>>
>>> x=> SET enable_bitmapscan=off;
>>> SET
>>> x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
>>>                                                                          QUERY PLAN
                                          
>>>
----------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
>>>  Sort Key: entity2document2.hepval
>>>  Sort Method:  quicksort  Memory: 2301kB
>>> ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
>>>        Index Cond: ((name)::text = 'ranitidine'::text)
>>> Total runtime: 79967.705 ms
>>> (6 rows)
>>>
>>> Any other idea?
>>>
>>
>>> Please post your hardware configuration. I think your DB is coming from disk, and the disks are slow.
>
> The server has two quad-core processors, 10GB of RAM, and the data is located on a 2TB fibre-attached disk. It doesn't seem to be the problem…

And your database size is?

Also, do these timings get better on consecutive runs?


>

> Thank you
>
> Andrés
>



Re: Query taking long time

From
"acanada"
Date:
On Mar 7, 2014, at 11:03 AM, Evgeniy Shishkin wrote:

>
> On 07 Mar 2014, at 12:46, acanada <acanada@cnio.es> wrote:
>
>>
>> On Mar 7, 2014, at 10:39 AM, Evgeniy Shishkin wrote:
>>
>>>
>>>> Hello Mat,
>>>>
>>>> Setting enable_bitmapscan to off doesn't really help. It gets worse...
>>>>
>>>> x=> SET enable_bitmapscan=off;
>>>> SET
>>>> x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
>>>>                                                                         QUERY PLAN
                                          
>>>>
----------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>> Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
>>>> Sort Key: entity2document2.hepval
>>>> Sort Method:  quicksort  Memory: 2301kB
>>>> ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
>>>>       Index Cond: ((name)::text = 'ranitidine'::text)
>>>> Total runtime: 79967.705 ms
>>>> (6 rows)
>>>>
>>>> Any other idea?
>>>>
>>>
>>> Please post your hardware configuration. I think your DB is coming from disk, and the disks are slow.
>>
>> The server has two quad-core processors, 10GB of RAM, and the data is located on a 2TB fibre-attached disk. It doesn't seem to be the problem…
>
> And your database size is?
>
> Also do this timings get better in consecutive runs?
>

The entity2document2 table is 30 GB. On consecutive runs it gets much better... approximately 30 ms.
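That pattern (a slow cold run followed by ~30 ms warm runs) points at random heap I/O on the first execution, with later runs served from shared_buffers and the OS page cache. On PostgreSQL 9.0 or later (the server in this thread is 8.3.8, so this is not available there) the BUFFERS option of EXPLAIN makes this visible:

```sql
-- "shared hit" counts pages found in cache; "read" counts pages fetched from disk
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM entity2document2 WHERE name = 'ranitidine' ORDER BY hepval;
```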



>
>>
>
>> Thank you
>>
>> Andrés
>>
>>>
>>>
>>>
>>>> Thank you very much for your help. Regards,
>>>> Andrés
>>>>
>>>> El Mar 6, 2014, a las 2:11 PM, desmodemone escribió:
>>>>
>>>>>
>>>>> Il 05/mar/2014 00:36 "Venkata Balaji Nagothi" <vbnpgc@gmail.com> ha scritto:
>>>>>>
>>>>>> After looking at the distinct values, yes the composite Index on "name" and "hepval" is not recommended. That
wouldworsen - its expected. 
>>>>>>
>>>>>> We need to look for other possible work around. Please drop off the above Index. Let me see if i can drill
furtherinto this. 
>>>>>>
>>>>>> Meanwhile - can you help us know the memory parameters (work_mem, temp_buffers etc) set ?
>>>>>>
>>>>>> Do you have any other processes effecting this query's performance ?
>>>>>>
>>>>>> Any info about your Disk, RAM, CPU would also help.
>>>>>>
>>>>>> Regards,
>>>>>> Venkata Balaji N
>>>>>>
>>>>>> Fujitsu Australia
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Venkata Balaji N
>>>>>>
>>>>>> Sr. Database Administrator
>>>>>> Fujitsu Australia
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I don't know if this helps to figure out what is the problem but after adding the multicolumn index on name and
hepval,the performance is even worse (¿?). Ten times worse... 
>>>>>>>
>>>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a
orderby a.hepval; 
>>>>>>>                                                                       QUERY PLAN
                                          
>>>>>>>
-------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>> Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>>>>>> Sort Key: entity_compounddict2document.hepval
>>>>>>> Sort Method:  quicksort  Memory: 25622kB
>>>>>>> ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual
time=70.789..95519.258rows=138165 loops=1) 
>>>>>>>       Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>>>>       ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual
time=35.174..35.174rows=138165 loops=1) 
>>>>>>>             Index Cond: ((name)::text = 'progesterone'::text)
>>>>>>> Total runtime: 95811.838 ms
>>>>>>> (8 rows)
>>>>>>>
>>>>>>> Any ideas please?
>>>>>>>
>>>>>>> Thank you
>>>>>>> Andrés.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> El Mar 4, 2014, a las 12:28 AM, Venkata Balaji Nagothi escribió:
>>>>>>>
>>>>>>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Thankyou for your answer.
>>>>>>>>> I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in
orderto avoid the join clause. Now the schema is very simple. The query only implies one table: 
>>>>>>>>>
>>>>>>>>> x=> \d+ entity_compounddict2document;
>>>>>>>>>                    Table "public.entity_compounddict2document"
>>>>>>>>>    Column      |              Type              | Modifiers | Storage  | Description
>>>>>>>>> ------------------+--------------------------------+-----------+----------+-------------
>>>>>>>>> id               | integer                        | not null  | plain    |
>>>>>>>>> document_id      | integer                        |           | plain    |
>>>>>>>>> name             | character varying(255)         |           | extended |
>>>>>>>>> qualifier        | character varying(255)         |           | extended |
>>>>>>>>> tagMethod        | character varying(255)         |           | extended |
>>>>>>>>> created          | timestamp(0) without time zone |           | plain    |
>>>>>>>>> updated          | timestamp(0) without time zone |           | plain    |
>>>>>>>>> curation         | integer                        |           | plain    |
>>>>>>>>> hepval           | double precision               |           | plain    |
>>>>>>>>> cardval          | double precision               |           | plain    |
>>>>>>>>> nephval          | double precision               |           | plain    |
>>>>>>>>> phosval          | double precision               |           | plain    |
>>>>>>>>> patternCount     | double precision               |           | plain    |
>>>>>>>>> ruleScore        | double precision               |           | plain    |
>>>>>>>>> hepTermNormScore | double precision               |           | plain    |
>>>>>>>>> hepTermVarScore  | double precision               |           | plain    |
>>>>>>>>> Indexes:
>>>>>>>>>  "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>>>>>>>>>  "entity_compound2document_cardval" btree (cardval)
>>>>>>>>>  "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>>>>>>>>>  "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>>>>>>>>>  "entity_compound2document_hepval" btree (hepval)
>>>>>>>>>  "entity_compound2document_name" btree (name)
>>>>>>>>>  "entity_compound2document_nephval" btree (nephval)
>>>>>>>>>  "entity_compound2document_patterncount" btree ("patternCount")
>>>>>>>>>  "entity_compound2document_phosval" btree (phosval)
>>>>>>>>>  "entity_compound2document_rulescore" btree ("ruleScore")
>>>>>>>>> Has OIDs: no
>>>>>>>>>
>>>>>>>>>         tablename            |                   indexname                                              |
num_rows   | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched  
>>>>>>>>> entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |      0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |              0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |              0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |      0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    |
1505MB    | Y      |              24 |      178680 |              0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |     0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |   0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |      0 
>>>>>>>>> entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |     0 
>>>>>>>>> entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    |
1162MB    | Y      |               0 |           0 |      0 
>>>>>>>>>
>>>>>>>>> The table has aprox. 54,000,000 rows
>>>>>>>>> There are no NULLs in hepval field and pg_settings haven't changed. I also have done "analyze" to this table.
>>>>>>>>>
>>>>>>>>> I have simplified the query and added the last advise that you told me:
>>>>>>>>>
>>>>>>>>> Query:
>>>>>>>>>
>>>>>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a
orderby a.hepval; 
>>>>>>>>>                                                                       QUERY PLAN
>>>>>>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>>>> Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>>>>>>   Sort Key: entity_compounddict2document.hepval
>>>>>>>>>   Sort Method:  quicksort  Memory: 2301kB
>>>>>>>>>   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>>>>>>         Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>>>>>>         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>>>>>>               Index Cond: ((name)::text = 'ranitidine'::text)
>>>>>>>>> Total runtime: 32717.548 ms
>>>>>>>>>
>>>>>>>>> Another query:
>>>>>>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a order by a.hepval;
>>>>>>>>>
>>>>>>>>> QUERY PLAN
>>>>>>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>>>>> Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>>>>>>   Sort Key: entity_compounddict2document.hepval
>>>>>>>>>   Sort Method:  quicksort  Memory: 25622kB
>>>>>>>>>   ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>>>>>>         Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>>>>>>         ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>>>>>>               Index Cond: ((name)::text = 'progesterone'::text)
>>>>>>>>> Total runtime: 9296.815 ms
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It has improved (I suppose because of the removal of the join) but it is still taking a lot of time... Anything I can do?
>>>>>>>>>
>>>>>>>>> Any help would be very appreciated. Thank you very much.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Good to know performance has increased.
>>>>>>>>
>>>>>>>> Does the "entity_compounddict2document" table go through heavy INSERTs?
>>>>>>>>
>>>>>>>> Can you help us know whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value from the pg_stats table would have that info.
>>>>>>>>
>>>>>>>> Below could be a possible workaround -
>>>>>>>>
>>>>>>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using the same index.
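To make the suggestion above concrete, here is a minimal SQL sketch of the composite index plus CLUSTER, using the table name from the schema shown earlier in this thread (the index name is hypothetical). Note that CLUSTER takes an exclusive lock and rewrites the whole table, so run it during a maintenance window:

```sql
-- Composite index matching the filter column (name) and sort key (hepval)
CREATE INDEX entity_compound2document_name_hepval
    ON entity_compounddict2document (name, hepval);

-- Physically reorder the heap by that index (exclusive lock, full rewrite)
CLUSTER entity_compounddict2document
    USING entity_compound2document_name_hepval;

-- Refresh planner statistics after the rewrite
ANALYZE entity_compounddict2document;
```

CLUSTER is a one-time reordering: later INSERTs are not kept in order, which is why this is only attractive for tables with little write traffic.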
>>>>>>>>
>>>>>>>> Other recommendations -
>>>>>>>>
>>>>>>>> Please drop all the non-primary-key indexes which have 0 scans / hits. They would otherwise burden the DB and the DB server during maintenance and DML operations.
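The zero-scan indexes can be confirmed from the statistics collector before dropping anything; a hedged sketch (verify idx_scan is still 0 on the production instance before each DROP):

```sql
-- List indexes on the table that have never been used by a scan
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'entity_compounddict2document'
  AND idx_scan = 0;

-- Example: drop one of the zero-scan indexes reported above
DROP INDEX entity_compound2document_cardval;
```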
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Venkata Balaji N
>>>>>>>>
>>>>>>>> Fujitsu Australia
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.
>>>>>
>>>>> Mat Dba
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>





Re: Query taking long time

From
Evgeniy Shishkin
Date:
On 07 Mar 2014, at 13:18, acanada <acanada@cnio.es> wrote:

> The table entity2document2 has 30GB. In consecutive runs it gets much better... approx. 30 ms.

So you are just benchmarking your hard drives with random IOPS.

You need more RAM and faster disks.

Re: Query taking long time

From
"acanada"
Date:
Hello Evgeniy!

I can move the database to another server...
This is the cat of /proc/cpuinfo. Does it have enough power or should I go for a better one??

(It has 32 processors like this one):

cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
stepping        : 7
microcode       : 0x70d
cpu MHz         : 1200.000
cache size      : 20480 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5187.62
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:


free
             total       used       free     shared    buffers     cached
Mem:      65901148   32702336   33198812          0     264936   20625024
-/+ buffers/cache:   11812376   54088772
Swap:    134217724     413088  133804636


Thank you for your help,
Andrés


On Mar 10, 2014, at 3:45 PM, Evgeniy Shishkin wrote:

>
> On 07 Mar 2014, at 13:18, acanada <acanada@cnio.es> wrote:
>
>> The table entity2document2 has 30GB. In consecutive runs it gets much better... approx. 30 ms.
>
> So you are just benchmarking your hard drives with random IOPS.
>
> You need more RAM and faster disks.





Re: Query taking long time

From
Venkata Balaji Nagothi
Date:

On Tue, Mar 11, 2014 at 3:30 AM, acanada <acanada@cnio.es> wrote:
Hello Evgeniy!

I can move the database to another server...
This is the cat of /proc/cpuinfo. Does it have enough power or should I go for a better one??

(It has 32 processors like this one):

cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
stepping        : 7
microcode       : 0x70d
cpu MHz         : 1200.000
cache size      : 20480 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 5187.62
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:


free
             total       used       free     shared    buffers     cached
Mem:      65901148   32702336   33198812          0     264936   20625024
-/+ buffers/cache:   11812376   54088772
Swap:    134217724     413088  133804636

Please let us know the disk configuration of the server. Also, do any other processes use this server heavily?


Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia 

Re: Query taking long time

From
desmodemone
Date:
Hello Andres,
                       with enable_bitmapscan=off, could you run:

explain ( analyze , buffers ) select * from entity2document2  where name='ranitidine' ;

I think it would be interesting to understand how well clustered the table entity2document2 is.
In fact the query extracts 13512 rows in 79945.362 ms, roughly 6 ms per row, and I suspect the table is not well clustered on that column, so every fetch is asking for a different page of the table, or the I/O system has some problem.

Moreover, another point: how big is the table? It has around 94M rows, but how big is it on disk? The average row length is important.
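The clustering question above can also be checked quickly from the planner's statistics; a sketch using the pg_stats view (correlation near +1 or -1 means the heap is physically ordered by that column; near 0 means each matching row tends to live on a different page, which matches the slow bitmap heap scan seen here):

```sql
-- Physical ordering and duplicate-count statistics for the filter and sort columns
SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'entity2document2'
  AND attname IN ('name', 'hepval');
```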


Have a nice day

2014-03-06 15:45 GMT+01:00 acanada <acanada@cnio.es>:
Hello Mat,

Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=> SET enable_bitmapscan=off; 
SET
x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)

Any other idea? 

Thank you very much for your help. Regards,
Andrés

On Mar 6, 2014, at 2:11 PM, desmodemone wrote:


On 05 Mar 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
>
> After looking at the distinct values, yes, the composite index on "name" and "hepval" is not recommended. That it made things worse is expected.
>
> We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this.
>
> Meanwhile, can you help us know the memory parameters (work_mem, temp_buffers, etc.) that are set?
>
> Do you have any other processes affecting this query's performance?
>
> Any info about your Disk, RAM, CPU would also help.
>
> Regards,
> Venkata Balaji N
>
> Fujitsu Australia
>
>
>
>
> Venkata Balaji N
>
> Sr. Database Administrator
> Fujitsu Australia
>
>
> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>
>> Hello,
>>
>> I don't know if this helps to figure out what is the problem but after adding the multicolumn index on name and hepval, the performance is even worse (¿?).  Ten times worse...
>>
>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
>>                                                                          QUERY PLAN                                                                          
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>    Sort Key: entity_compounddict2document.hepval
>>    Sort Method:  quicksort  Memory: 25622kB
>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
>>                Index Cond: ((name)::text = 'progesterone'::text)
>>  Total runtime: 95811.838 ms
>> (8 rows)
>>
>> Any ideas please?
>>
>> Thank you 
>> Andrés.
>>
>>
>>
>> On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:
>>
>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Thank you for your answer.
>>>> I have made more changes than a simple re-indexing recently. I have moved the sorting field into the table in order to avoid the join clause. Now the schema is very simple. The query involves only one table:
>>>>
>>>> x=> \d+ entity_compounddict2document;
>>>>                       Table "public.entity_compounddict2document"
>>>>       Column      |              Type              | Modifiers | Storage  | Description 
>>>> ------------------+--------------------------------+-----------+----------+-------------
>>>>  id               | integer                        | not null  | plain    | 
>>>>  document_id      | integer                        |           | plain    | 
>>>>  name             | character varying(255)         |           | extended | 
>>>>  qualifier        | character varying(255)         |           | extended | 
>>>>  tagMethod        | character varying(255)         |           | extended | 
>>>>  created          | timestamp(0) without time zone |           | plain    | 
>>>>  updated          | timestamp(0) without time zone |           | plain    | 
>>>>  curation         | integer                        |           | plain    | 
>>>>  hepval           | double precision               |           | plain    | 
>>>>  cardval          | double precision               |           | plain    | 
>>>>  nephval          | double precision               |           | plain    | 
>>>>  phosval          | double precision               |           | plain    | 
>>>>  patternCount     | double precision               |           | plain    | 
>>>>  ruleScore        | double precision               |           | plain    | 
>>>>  hepTermNormScore | double precision               |           | plain    | 
>>>>  hepTermVarScore  | double precision               |           | plain    | 
>>>> Indexes:
>>>>     "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>>>>     "entity_compound2document_cardval" btree (cardval)
>>>>     "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>>>>     "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>>>>     "entity_compound2document_hepval" btree (hepval)
>>>>     "entity_compound2document_name" btree (name)
>>>>     "entity_compound2document_nephval" btree (nephval)
>>>>     "entity_compound2document_patterncount" btree ("patternCount")
>>>>     "entity_compound2document_phosval" btree (phosval)
>>>>     "entity_compound2document_rulescore" btree ("ruleScore")
>>>> Has OIDs: no
>>>>
>>>>            tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
>>>>  entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
>>>>  entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>
>>>> The table has approx. 54,000,000 rows.
>>>> There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.
>>>>
>>>> I have simplified the query and applied the last piece of advice you gave me:
>>>>
>>>> Query: 
>>>>
>>>>  explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>>>>                                                                       QUERY PLAN                                                                      
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 2301kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>          Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>                Index Cond: ((name)::text = 'ranitidine'::text)
>>>>  Total runtime: 32717.548 ms
>>>>
>>>> Another query:
>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;
>>>>
>>>> QUERY PLAN
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 25622kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>                Index Cond: ((name)::text = 'progesterone'::text)
>>>>  Total runtime: 9296.815 ms
>>>>
>>>>
>>>> It has improved (I suppose because of the removal of the join) but it is still taking a lot of time... Anything I can do?
>>>>
>>>> Any help would be very appreciated. Thank you very much.
>>>
>>>
>>>
>>> Good to know performance has increased.
>>>
>>> Does the "entity_compounddict2document" table go through heavy INSERTs?
>>>
>>> Can you help us know whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value from the pg_stats table would have that info.
>>>
>>> Below could be a possible workaround -
>>>
>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using the same index.
>>>
>>> Other recommendations -
>>>
>>> Please drop all the non-primary-key indexes which have 0 scans / hits. They would otherwise burden the DB and the DB server during maintenance and DML operations.
>>>
>>> Regards,
>>> Venkata Balaji N
>>>
>>> Fujitsu Australia
>>
>>
>>
>>
>


Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.

Mat Dba





Re: Query taking long time

From
"acanada"
Date:
Hello,

I cannot run explain (analyze, buffers) since I am on PostgreSQL 8.3.
I am migrating to the new server and upgrading it.
Once it is ready I will post the explain output here.
The new disk is a 5TB SATA disk, raid 0 or 1...
lspci | grep -i raid
00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)

The whole database is 200GB, and the table entity2document2 is:

x=> select pg_size_pretty(pg_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 11 GB
(1 row)

x=> select pg_size_pretty(pg_total_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 29 GB
(1 row)

The index of the name column:
x=> select pg_size_pretty(pg_relation_size('entity2document2_name'));
 pg_size_pretty 
----------------
 2550 MB
(1 row)


I am tuning the new server with these parameters...
shared_buffers = 15000MB
work_mem = 1000MB
maintenance_work_mem = 2000MB

Any other parameter that should be modified?

Thank you for your help!
Andrés


On Mar 10, 2014, at 9:22 PM, desmodemone wrote:

Hello Andres,
                       with enable_bitmapscan=off, could you run:

explain ( analyze , buffers ) select * from entity2document2  where name='ranitidine' ;

I think it would be interesting to understand how well clustered the table entity2document2 is.
In fact the query extracts 13512 rows in 79945.362 ms, roughly 6 ms per row, and I suspect the table is not well clustered on that column, so every fetch is asking for a different page of the table, or the I/O system has some problem.

Moreover, another point: how big is the table? It has around 94M rows, but how big is it on disk? The average row length is important.


Have a nice day

2014-03-06 15:45 GMT+01:00 acanada <acanada@cnio.es>:
Hello Mat,

Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=> SET enable_bitmapscan=off; 
SET
x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)

Any other idea? 

Thank you very much for your help. Regards,
Andrés

On Mar 6, 2014, at 2:11 PM, desmodemone wrote:


On 05 Mar 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
>
> After looking at the distinct values, yes, the composite index on "name" and "hepval" is not recommended. That it made things worse is expected.
>
> We need to look for another possible workaround. Please drop the above index. Let me see if I can drill further into this.
>
> Meanwhile, can you help us know the memory parameters (work_mem, temp_buffers, etc.) that are set?
>
> Do you have any other processes affecting this query's performance?
>
> Any info about your Disk, RAM, CPU would also help.
>
> Regards,
> Venkata Balaji N
>
> Fujitsu Australia
>
>
>
>
> Venkata Balaji N
>
> Sr. Database Administrator
> Fujitsu Australia
>
>
> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>
>> Hello,
>>
>> I don't know if this helps to figure out what is the problem but after adding the multicolumn index on name and hepval, the performance is even worse (¿?).  Ten times worse...
>>
>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
>>                                                                          QUERY PLAN                                                                          
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>    Sort Key: entity_compounddict2document.hepval
>>    Sort Method:  quicksort  Memory: 25622kB
>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
>>                Index Cond: ((name)::text = 'progesterone'::text)
>>  Total runtime: 95811.838 ms
>> (8 rows)
>>
>> Any ideas please?
>>
>> Thank you 
>> Andrés.
>>
>>
>>
>> On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:
>>
>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Thank you for your answer.
>>>> I have made more changes than a simple re-indexing recently. I have moved the sorting field into the table in order to avoid the join clause. Now the schema is very simple. The query involves only one table:
>>>>
>>>> x=> \d+ entity_compounddict2document;
>>>>                       Table "public.entity_compounddict2document"
>>>>       Column      |              Type              | Modifiers | Storage  | Description 
>>>> ------------------+--------------------------------+-----------+----------+-------------
>>>>  id               | integer                        | not null  | plain    | 
>>>>  document_id      | integer                        |           | plain    | 
>>>>  name             | character varying(255)         |           | extended | 
>>>>  qualifier        | character varying(255)         |           | extended | 
>>>>  tagMethod        | character varying(255)         |           | extended | 
>>>>  created          | timestamp(0) without time zone |           | plain    | 
>>>>  updated          | timestamp(0) without time zone |           | plain    | 
>>>>  curation         | integer                        |           | plain    | 
>>>>  hepval           | double precision               |           | plain    | 
>>>>  cardval          | double precision               |           | plain    | 
>>>>  nephval          | double precision               |           | plain    | 
>>>>  phosval          | double precision               |           | plain    | 
>>>>  patternCount     | double precision               |           | plain    | 
>>>>  ruleScore        | double precision               |           | plain    | 
>>>>  hepTermNormScore | double precision               |           | plain    | 
>>>>  hepTermVarScore  | double precision               |           | plain    | 
>>>> Indexes:
>>>>     "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>>>>     "entity_compound2document_cardval" btree (cardval)
>>>>     "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>>>>     "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>>>>     "entity_compound2document_hepval" btree (hepval)
>>>>     "entity_compound2document_name" btree (name)
>>>>     "entity_compound2document_nephval" btree (nephval)
>>>>     "entity_compound2document_patterncount" btree ("patternCount")
>>>>     "entity_compound2document_phosval" btree (phosval)
>>>>     "entity_compound2document_rulescore" btree ("ruleScore")
>>>> Has OIDs: no
>>>>
>>>>            tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
>>>>  entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
>>>>  entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>
>>>> The table has approx. 54,000,000 rows.
>>>> There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.
>>>>
>>>> I have simplified the query and applied the last suggestion you gave me:
>>>>
>>>> Query: 
>>>>
>>>>  explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>>>>                                                                       QUERY PLAN                                                                      
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 2301kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>          Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>                Index Cond: ((name)::text = 'ranitidine'::text)
>>>>  Total runtime: 32717.548 ms
>>>>
>>>> Another query:
>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;
>>>>
>>>> QUERY PLAN
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 25622kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>                Index Cond: ((name)::text = 'progesterone'::text)
>>>>  Total runtime: 9296.815 ms
>>>>
>>>>
>>>> It has improved (I suppose because the join is gone) but it is still taking a lot of time... Is there anything else I can do?
>>>>
>>>> Any help would be very appreciated. Thank you very much.
>>>
>>>
>>>
>>> Good to know performance has increased.
>>>
>>> Does the "entity_compounddict2document" table go through heavy INSERT activity?
>>>
>>> Can you let us know whether the "hepval" and "name" columns have many duplicate values? The "n_distinct" value from the pg_stats view would have that info. 
>>>
>>> Below could be a possible workaround -
>>>
>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
>>>
>>> Other recommendations -
>>>
>>> Please drop all the non-primary-key indexes which have 0 scans / hits. They add overhead to the DB and the DB server during maintenance and DML operations.
>>>
>>> Regards,
>>> Venkata Balaji N
>>>
>>> Fujitsu Australia
>>
>>
>>
>> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
>>
>> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>>
>


Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.
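
A minimal psql sketch of this suggestion (table name and filter taken from the thread; enable_bitmapscan is session-local, so nothing needs restarting):

```sql
-- Disable bitmap scans for this session only, then compare plans
SET enable_bitmapscan = off;

EXPLAIN ANALYZE
SELECT * FROM entity2document2
WHERE name = 'ranitidine'
ORDER BY hepval;

-- Restore the planner default for the rest of the session
RESET enable_bitmapscan;
```

If the plain index scan plan turns out even slower, the bottleneck is more likely random heap I/O than the bitmap conversion itself.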

Mat Dba





Re: Query taking long time

From
Evgeny Shishkin
Date:
Hello,

A new server with more RAM will definitely help keep your working set in memory.
But if you want your queries to be fast on cold (on-disk) data, then you need more/faster disks.

Also, work_mem = 1000MB is far too much; better to set it to 32MB so you don't run into the OOM killer.
And maybe slightly lower shared_buffers. 
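
As a rough postgresql.conf sketch of this advice (values are illustrative for a machine of this size, not a tested recommendation):

```
# work_mem is allocated per sort/hash operation, per backend,
# so a large value multiplies quickly across 100 connections
work_mem = 32MB

# somewhat below the proposed 15000MB, per the advice above
shared_buffers = 10GB
```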

On 11 Mar 2014, at 18:56, acanada <acanada@cnio.es> wrote:

Hello,

I cannot do EXPLAIN (ANALYZE, BUFFERS) since I am on PostgreSQL 8.3.
I am migrating to the new server and upgrading it.
Once it is ready I will post the EXPLAIN output here.
The new disk is a 5TB SATA disk, RAID 0 or 1...
lspci | grep -i raid
00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)

The whole database is 200GB, and the table entity2document2 is:

x=> select pg_size_pretty(pg_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 11 GB
(1 row)

x=> select pg_size_pretty(pg_total_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 29 GB
(1 row)

The index of the name column:
x=> select pg_size_pretty(pg_relation_size('entity2document2_name'));
 pg_size_pretty 
----------------
 2550 MB
(1 row)


I am tuning the new server with these parameters...
shared_buffers = 15000MB
work_mem = 1000MB
maintenance_work_mem = 2000MB

Any other parameter that should be modified?

Thank you for your help!
Andrés


On Mar 10, 2014, at 9:22 PM, desmodemone wrote:

Hello Andres,
                       with enable_bitmapscan = off, could you run:

explain ( analyze , buffers ) select * from entity2document2  where name='ranitidine' ;

I think it would be interesting to understand how well clustered the table entity2document2 is.
In fact the query extracts 13512 rows in 79945.362 ms, around 6 ms per row, and I suspect the table is not well clustered on that column, so each row fetch asks for a different page of the table, or else the I/O system has some problem.

Moreover, another point: how big is the table? The row count is around 94M, but how big is it on disk? The average row length is important.
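
Both questions can be checked from the system catalogs; a sketch using the identifiers from the thread (in pg_stats, a correlation near +1 or -1 means the heap is physically well ordered on that column):

```sql
-- Clustering and selectivity of the filter/sort columns
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'entity2document2'
  AND attname IN ('name', 'hepval');

-- Average row length: heap size divided by the planner's row estimate
SELECT pg_relation_size('entity2document2')
       / GREATEST(reltuples::bigint, 1) AS avg_row_bytes
FROM pg_class
WHERE relname = 'entity2document2';
```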


Have a nice day

2014-03-06 15:45 GMT+01:00 acanada <acanada@cnio.es>:
Hello Mat,

Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=> SET enable_bitmapscan=off; 
SET
x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)

Any other idea? 

Thank you very much for your help. Regards,
Andrés

On Mar 6, 2014, at 2:11 PM, desmodemone wrote:


On Mar 5, 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
>
> After looking at the distinct values: yes, the composite index on "name" and "hepval" is not recommended. That it made things worse is expected.
>
> We need to look for other possible workarounds. Please drop the above index. Let me see if I can drill further into this.
>
> Meanwhile, can you let us know the memory parameters (work_mem, temp_buffers, etc.) that are set?
>
> Do you have any other processes affecting this query's performance?
>
> Any info about your Disk, RAM, CPU would also help.
>
> Regards,
> Venkata Balaji N
>
> Fujitsu Australia
>
>
>
>
> Venkata Balaji N
>
> Sr. Database Administrator
> Fujitsu Australia
>
>
> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>
>> Hello,
>>
>> I don't know if this helps to figure out what the problem is, but after adding the multicolumn index on name and hepval, the performance is even worse. Ten times worse...
>>
>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
>>                                                                          QUERY PLAN                                                                          
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>    Sort Key: entity_compounddict2document.hepval
>>    Sort Method:  quicksort  Memory: 25622kB
>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
>>                Index Cond: ((name)::text = 'progesterone'::text)
>>  Total runtime: 95811.838 ms
>> (8 rows)
>>
>> Any ideas please?
>>
>> Thank you 
>> Andrés.
>>
>>
>>


Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.

Mat Dba






Re: Query taking long time

From
"acanada"
Date:
Hello,

First of all I'd like to thank all of you for taking the time to help me with this. Thank you very much.

I did migrate the database to the new server, with 32 Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz processors and 60GB of RAM. 
Evgeny pointed out that the disks I am using are not fast enough (for data: 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05); for logging, a SAS disk, but with only 240GB available, and the database is 365GB...). I cannot move the data to the SAS disk since there is not enough space for it. Sadly this is a problem that I cannot solve any time soon...

The migration has really improved the performance.
I paste the before and after (the migration) EXPLAIN ANALYZE, with BUFFERS where applicable given the server versions:

BEFORE:
explain analyze select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18015.66..18027.15 rows=4595 width=139) (actual time=39755.942..39756.246 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 578kB
   ->  Bitmap Heap Scan on entity2document2  (cost=116.92..17736.15 rows=4595 width=139) (actual time=45.682..39751.255 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         ->  Bitmap Index Scan on entity2document2_name  (cost=0.00..115.77 rows=4595 width=0) (actual time=45.124..45.124 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
 Total runtime: 39756.507 ms

 AFTER:
 explain (analyze,buffers) select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18434.76..18446.51 rows=4701 width=131) (actual time=9196.634..9196.909 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method: quicksort  Memory: 604kB
   Buffers: shared hit=4 read=1725
   ->  Bitmap Heap Scan on entity2document2  (cost=105.00..18148.03 rows=4701 width=131) (actual time=38.668..9190.318 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         Buffers: shared hit=4 read=1725
         ->  Bitmap Index Scan on entity2documentnew_name  (cost=0.00..103.82 rows=4701 width=0) (actual time=30.905..30.905 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
               Buffers: shared hit=1 read=14
 Total runtime: 9197.186 ms

The improvement is definitely good!
This is the table that I'm using: 
\d+ entity2document2;
                                    Table "public.entity2document2"
      Column      |              Type              | Modifiers | Storage  | Stats target | Description 
------------------+--------------------------------+-----------+----------+--------------+-------------
 id               | integer                        | not null  | plain    |              | 
 document_id      | integer                        |           | plain    |              | 
 name             | character varying(255)         | not null  | extended |              | 
 qualifier        | character varying(255)         | not null  | extended |              | 
 tagMethod        | character varying(255)         |           | extended |              | 
 created          | timestamp(0) without time zone | not null  | plain    |              | 
 updated          | timestamp(0) without time zone |           | plain    |              | 
 curation         | integer                        |           | plain    |              | 
 hepval           | double precision               |           | plain    |              | 
 cardval          | double precision               |           | plain    |              | 
 nephval          | double precision               |           | plain    |              | 
 phosval          | double precision               |           | plain    |              | 
 patternCount     | double precision               |           | plain    |              | 
 ruleScore        | double precision               |           | plain    |              | 
 hepTermNormScore | double precision               |           | plain    |              | 
 hepTermVarScore  | double precision               |           | plain    |              | 
 svmConfidence    | double precision               |           | plain    |              | 
Indexes:
    "ent_pkey" PRIMARY KEY, btree (id)
    "ent_cardval" btree (cardval)
    "ent_document_id" btree (document_id)
    "ent_heptermnormscore" btree ("hepTermNormScore")
    "ent_heptermvarscore" btree ("hepTermVarScore")
    "ent_hepval" btree (hepval)
    "ent_name" btree (name)
    "ent_nephval" btree (nephval)
    "ent_patterncount" btree ("patternCount")
    "ent_phosval" btree (phosval)
    "ent_qualifier" btree (qualifier)
    "ent_qualifier_name" btree (qualifier, name)
    "ent_rulescore" btree ("ruleScore")
    "ent_svm_confidence_index" btree ("svmConfidence")

And these are my current settings:

            name            |  current_setting   |        source        
----------------------------+--------------------+----------------------
 application_name           | psql               | client
 client_encoding            | UTF8               | client
 DateStyle                  | ISO, MDY           | configuration file
 default_text_search_config | pg_catalog.english | configuration file
 effective_cache_size       | 45000MB            | configuration file
 lc_messages                | en_US.UTF-8        | configuration file
 lc_monetary                | en_US.UTF-8        | configuration file
 lc_numeric                 | en_US.UTF-8        | configuration file
 lc_time                    | en_US.UTF-8        | configuration file
 listen_addresses           | *                  | configuration file
 log_timezone               | Europe/Madrid      | configuration file
 logging_collector          | on                 | configuration file
 maintenance_work_mem       | 4000MB             | configuration file
 max_connections            | 100                | configuration file
 max_stack_depth            | 2MB                | environment variable
 shared_buffers             | 10000MB            | configuration file
 TimeZone                   | Europe/Madrid      | configuration file
 work_mem                   | 32MB               | configuration file

The size of the table is 41 GB; some statistics:
 relname             | rows_in_bytes |  num_rows   | number_of_indexes | unique | single_column | multi_column 
entity2document2               | 89 MB         | 9.33479e+07 |                14 | Y      |            13 |            1


I'm running the CLUSTER on the table right now, using the name+hepval composite index as Venkata suggested, and will post back whether it works.
Anyway, even though the improvement is important, I'd still like better performance. When the number of rows returned is high, performance degrades too much.
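For reference, the CLUSTER step described above would look roughly like this (a sketch only; the index name is hypothetical, the thread only says a name+hepval composite index exists):

```sql
-- Assumes a composite index on (name, hepval) was already created, e.g.:
--   CREATE INDEX ent_name_hepval ON entity2document2 (name, hepval);

-- CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the table in index order:
CLUSTER entity2document2 USING ent_name_hepval;

-- Refresh planner statistics after the rewrite:
ANALYZE entity2document2;
```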

If anyone has any ideas...

Best regards,
Andrés




On Mar 12, 2014, at 12:12 AM, Evgeny Shishkin wrote:

Hello,

A new server with more RAM will definitely help keep your working set in memory.
But if you want your queries to be fast on cold (on-disk) data, then you need more/faster disks.

And work_mem = 1000MB is too much; better set it to 32MB so you don't trigger the OOM killer.
And maybe lower shared_buffers slightly.
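Evgeny's point can be combined with a per-session override, so the server-wide setting stays conservative while one big sort still gets extra memory (the 256MB value here is illustrative, not a recommendation from the thread):

```sql
-- Server-wide work_mem stays small (e.g. 32MB in postgresql.conf).
-- Raise it only for the session that runs the expensive ORDER BY:
SET work_mem = '256MB';

-- ... run the big sorting query here ...

-- Return to the server-wide default:
RESET work_mem;
```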

On 11 Mar 2014, at 18:56, acanada <acanada@cnio.es> wrote:

Hello,

I cannot do EXPLAIN (ANALYZE, BUFFERS) since I am on Postgres version 8.3.
I am migrating to the new server and upgrading it.
Once it is ready again I will post the explain query here.
The new disk is SATA disk with 5TB, raid 0 or 1...
lspci | grep -i raid
00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)

The whole database is 200GB, and the table entity2document2 is:

x=> select pg_size_pretty(pg_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 11 GB
(1 row)

x=> select pg_size_pretty(pg_total_relation_size('entity2document2'));
 pg_size_pretty 
----------------
 29 GB
(1 row)

The index of the name column:
x=> select pg_size_pretty(pg_relation_size('entity2document2_name'));
 pg_size_pretty 
----------------
 2550 MB
(1 row)


I am tuning the new server with these parameters...
shared_buffers = 15000MB
work_mem = 1000MB
maintenance_work_mem = 2000MB

Any other parameter that should be modified?

Thank you for your help!
Andrés


On Mar 10, 2014, at 9:22 PM, desmodemone wrote:

Hello Andres,
                       with enable_bitmapscan=off, could you do:

explain ( analyze , buffers ) select * from entity2document2  where name='ranitidine' ;

I think it's interesting to understand how well clustered the table entity2document2 is.
In fact the query extracts 13512 rows in 79945.362 ms, around 6 ms per row, and I suspect the table is not well clustered on that column, so every time the process asks for a different page of the table, or the I/O system has some problem.

Moreover, another point is: how big is it? The row count is around 94M, but how big is the table on disk? The average row length is important.
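One way to answer the size / average-row-length question above is a sketch against the standard catalogs (table name as used in the thread; reltuples is only the planner's row estimate):

```sql
SELECT pg_size_pretty(pg_relation_size('entity2document2')) AS table_size,
       reltuples::bigint                                    AS est_rows,
       -- estimated average row length in bytes:
       (pg_relation_size('entity2document2') / NULLIF(reltuples, 0))::int
                                                            AS avg_row_bytes
FROM pg_class
WHERE relname = 'entity2document2';
```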


Have a nice day

2014-03-06 15:45 GMT+01:00 acanada <acanada@cnio.es>:
Hello Mat,

Setting enable_bitmapscan to off doesn't really help. It gets worse...

x=> SET enable_bitmapscan=off; 
SET
x=> explain analyze select * from (select * from entity2document2  where name='ranitidine' ) as a  order by a.hepval;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18789.21..18800.70 rows=4595 width=131) (actual time=79965.282..79966.657 rows=13512 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 2301kB
   ->  Index Scan using entity2document2_name on entity2document2  (cost=0.00..18509.70 rows=4595 width=131) (actual time=67.507..79945.362 rows=13512 loops=1)
         Index Cond: ((name)::text = 'ranitidine'::text)
 Total runtime: 79967.705 ms
(6 rows)

Any other idea? 

Thank you very much for your help. Regards,
Andrés

On Mar 6, 2014, at 2:11 PM, desmodemone wrote:


On 05 Mar 2014 at 00:36, "Venkata Balaji Nagothi" <vbnpgc@gmail.com> wrote:
>
> After looking at the distinct values, yes, the composite index on "name" and "hepval" is not recommended. That would make things worse - it's expected.
>
> We need to look for other possible work around. Please drop off the above Index. Let me see if i can drill further into this.
>
> Meanwhile - can you help us know the memory parameters (work_mem, temp_buffers etc.) that are set?
>
> Do you have any other processes affecting this query's performance?
>
> Any info about your Disk, RAM, CPU would also help.
>
> Regards,
> Venkata Balaji N
>
> Fujitsu Australia
>
>
>
>
> Venkata Balaji N
>
> Sr. Database Administrator
> Fujitsu Australia
>
>
> On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
>>
>> Hello,
>>
>> I don't know if this helps to figure out what the problem is, but after adding the multicolumn index on name and hepval, the performance is even worse (¿?).  Ten times worse...
>>
>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone') as a order by a.hepval;
>>                                                                          QUERY PLAN                                                                          
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Sort  (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
>>    Sort Key: entity_compounddict2document.hepval
>>    Sort Method:  quicksort  Memory: 25622kB
>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
>>                Index Cond: ((name)::text = 'progesterone'::text)
>>  Total runtime: 95811.838 ms
>> (8 rows)
>>
>> Any ideas please?
>>
>> Thank you 
>> Andrés.
>>
>>
>>
>> On Mar 4, 2014, at 12:28 AM, Venkata Balaji Nagothi wrote:
>>
>>> On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Thank you for your answer.
>>>> I have made more changes than a simple reindex recently. I have moved the sorting field into the table in order to avoid the join clause. Now the schema is very simple. The query involves only one table:
>>>>
>>>> x=> \d+ entity_compounddict2document;
>>>>                       Table "public.entity_compounddict2document"
>>>>       Column      |              Type              | Modifiers | Storage  | Description 
>>>> ------------------+--------------------------------+-----------+----------+-------------
>>>>  id               | integer                        | not null  | plain    | 
>>>>  document_id      | integer                        |           | plain    | 
>>>>  name             | character varying(255)         |           | extended | 
>>>>  qualifier        | character varying(255)         |           | extended | 
>>>>  tagMethod        | character varying(255)         |           | extended | 
>>>>  created          | timestamp(0) without time zone |           | plain    | 
>>>>  updated          | timestamp(0) without time zone |           | plain    | 
>>>>  curation         | integer                        |           | plain    | 
>>>>  hepval           | double precision               |           | plain    | 
>>>>  cardval          | double precision               |           | plain    | 
>>>>  nephval          | double precision               |           | plain    | 
>>>>  phosval          | double precision               |           | plain    | 
>>>>  patternCount     | double precision               |           | plain    | 
>>>>  ruleScore        | double precision               |           | plain    | 
>>>>  hepTermNormScore | double precision               |           | plain    | 
>>>>  hepTermVarScore  | double precision               |           | plain    | 
>>>> Indexes:
>>>>     "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>>>>     "entity_compound2document_cardval" btree (cardval)
>>>>     "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>>>>     "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>>>>     "entity_compound2document_hepval" btree (hepval)
>>>>     "entity_compound2document_name" btree (name)
>>>>     "entity_compound2document_nephval" btree (nephval)
>>>>     "entity_compound2document_patterncount" btree ("patternCount")
>>>>     "entity_compound2document_phosval" btree (phosval)
>>>>     "entity_compound2document_rulescore" btree ("ruleScore")
>>>> Has OIDs: no
>>>>
>>>>            tablename            |                   indexname                                              |  num_rows    | table_size  | index_size | unique | number_of_scans | tuples_read | tuples_fetched 
>>>>  entity_compounddict2document   | entity_compound2document_cardval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_heptermvarscore       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_hepval                | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_name                  | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |      178680 |              0
>>>>  entity_compounddict2document   | entity_compound2document_nephval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_patterncount          | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_phosval               | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compound2document_rulescore             | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>  entity_compounddict2document   | entity_compounddict2document_pkey              | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |           0 |              0
>>>>
>>>> The table has approx. 54,000,000 rows.
>>>> There are no NULLs in the hepval field and pg_settings haven't changed. I have also run ANALYZE on this table.
>>>>
>>>> I have simplified the query and applied the last piece of advice you gave me:
>>>>
>>>> Query: 
>>>>
>>>>  explain analyze select * from (select * from entity_compounddict2document  where name='ranitidine') as a order by a.hepval;
>>>>                                                                       QUERY PLAN                                                                      
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 2301kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
>>>>          Recheck Cond: ((name)::text = 'ranitidine'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
>>>>                Index Cond: ((name)::text = 'ranitidine'::text)
>>>>  Total runtime: 32717.548 ms
>>>>
>>>> Another query:
>>>> explain analyze select * from (select * from entity_compounddict2document  where name='progesterone' ) as a  order by a.hepval;
>>>>
>>>> QUERY PLAN
>>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>  Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
>>>>    Sort Key: entity_compounddict2document.hepval
>>>>    Sort Method:  quicksort  Memory: 25622kB
>>>>    ->  Bitmap Heap Scan on entity_compounddict2document  (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
>>>>          Recheck Cond: ((name)::text = 'progesterone'::text)
>>>>          ->  Bitmap Index Scan on entity_compound2document_name  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
>>>>                Index Cond: ((name)::text = 'progesterone'::text)
>>>>  Total runtime: 9296.815 ms
>>>>
>>>>
>>>> It has improved (I suppose because the join is gone), but it's still taking a lot of time... Is there anything I can do??
>>>>
>>>> Any help would be very appreciated. Thank you very much.
>>>
>>>
>>>
>>> Good to know performance has increased.
>>>
>>> Does the "entity_compounddict2document" table go through a high volume of INSERTs?
>>>
>>> Can you help us know if the "hepval" column and "name" column have many duplicate values? The "n_distinct" value from the pg_stats view would have that info.
>>>
>>> Below could be a possible workaround -
>>>
>>> As mentioned earlier in this email, a composite index on the name and hepval columns might help. If the table does not go through a lot of INSERTs, then consider performing a CLUSTER on the table using that same index.
>>>
>>> Other recommendations -
>>>
>>> Please drop all the non-primary-key indexes which have 0 scans/hits. They harm the DB and the DB server during maintenance and DML operations.
>>>
>>> Regards,
>>> Venkata Balaji N
>>>
>>> Fujitsu Australia
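The duplicate-value check Venkata asks about in the quoted message can be run against the pg_stats view, for example:

```sql
-- n_distinct > 0 is an absolute count of distinct values;
-- n_distinct < 0 is a negative fraction of the row count (e.g. -1 = all unique).
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'entity_compounddict2document'
  AND attname IN ('name', 'hepval');
```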
>>
>>
>>
>> **NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.
>>
>> **CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.
>>
>


Hi, I think the problem is the heap scan of the table, which the backend has to do because the btree-to-bitmap conversion becomes lossy. Try disabling enable_bitmapscan for the current session and rerun the query.

Mat Dba














Re: Query taking long time

From
Venkata Balaji Nagothi
Date:
On Wed, Mar 19, 2014 at 10:09 PM, acanada <acanada@cnio.es> wrote:

Hello,

First of all, I'd like to thank all of you for taking the time to help me with this. Thank you very much.

I did migrate the database to the new server with 32 processors Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz  and 60GB of RAM. 
Evgeny pointed out that the disks I am using are not fast enough (for data: 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05); and for logging a SAS disk, but with only 240GB available, while the database is 365GB...). I cannot move the data and log locations since there's not enough space for the data on the SAS disk. Sadly this is a problem that I cannot solve any time soon...

The migration has really improved the performance.
I paste the before and after (the migration) EXPLAIN ANALYZE output (with BUFFERS where applicable, given the server versions).

BEFORE:
explain analyze select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18015.66..18027.15 rows=4595 width=139) (actual time=39755.942..39756.246 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 578kB
   ->  Bitmap Heap Scan on entity2document2  (cost=116.92..17736.15 rows=4595 width=139) (actual time=45.682..39751.255 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         ->  Bitmap Index Scan on entity2document2_name  (cost=0.00..115.77 rows=4595 width=0) (actual time=45.124..45.124 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
 Total runtime: 39756.507 ms

 AFTER:
 explain (analyze,buffers) select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18434.76..18446.51 rows=4701 width=131) (actual time=9196.634..9196.909 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method: quicksort  Memory: 604kB
   Buffers: shared hit=4 read=1725
   ->  Bitmap Heap Scan on entity2document2  (cost=105.00..18148.03 rows=4701 width=131) (actual time=38.668..9190.318 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         Buffers: shared hit=4 read=1725
         ->  Bitmap Index Scan on entity2documentnew_name  (cost=0.00..103.82 rows=4701 width=0) (actual time=30.905..30.905 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
               Buffers: shared hit=1 read=14
 Total runtime: 9197.186 ms

The improvement is definitely good!
This is the table that I'm using: 
\d+ entity2document2;
                                    Table "public.entity2document2"
      Column      |              Type              | Modifiers | Storage  | Stats target | Description 
------------------+--------------------------------+-----------+----------+--------------+-------------
 id               | integer                        | not null  | plain    |              | 
 document_id      | integer                        |           | plain    |              | 
 name             | character varying(255)         | not null  | extended |              | 
 qualifier        | character varying(255)         | not null  | extended |              | 
 tagMethod        | character varying(255)         |           | extended |              | 
 created          | timestamp(0) without time zone | not null  | plain    |              | 
 updated          | timestamp(0) without time zone |           | plain    |              | 
 curation         | integer                        |           | plain    |              | 
 hepval           | double precision               |           | plain    |              | 
 cardval          | double precision               |           | plain    |              | 
 nephval          | double precision               |           | plain    |              | 
 phosval          | double precision               |           | plain    |              | 
 patternCount     | double precision               |           | plain    |              | 
 ruleScore        | double precision               |           | plain    |              | 
 hepTermNormScore | double precision               |           | plain    |              | 
 hepTermVarScore  | double precision               |           | plain    |              | 
 svmConfidence    | double precision               |           | plain    |              | 
Indexes:
    "ent_pkey" PRIMARY KEY, btree (id)
    "ent_cardval" btree (cardval)
    "ent_document_id" btree (document_id)
    "ent_heptermnormscore" btree ("hepTermNormScore")
    "ent_heptermvarscore" btree ("hepTermVarScore")
    "ent_hepval" btree (hepval)
    "ent_name" btree (name)
    "ent_nephval" btree (nephval)
    "ent_patterncount" btree ("patternCount")
    "ent_phosval" btree (phosval)
    "ent_qualifier" btree (qualifier)
    "ent_qualifier_name" btree (qualifier, name)
    "ent_rulescore" btree ("ruleScore")
    "ent_svm_confidence_index" btree ("svmConfidence")

And these are my current settings:

            name            |  current_setting   |        source        
----------------------------+--------------------+----------------------
 application_name           | psql               | client
 client_encoding            | UTF8               | client
 DateStyle                  | ISO, MDY           | configuration file
 default_text_search_config | pg_catalog.english | configuration file
 effective_cache_size       | 45000MB            | configuration file
 lc_messages                | en_US.UTF-8        | configuration file
 lc_monetary                | en_US.UTF-8        | configuration file
 lc_numeric                 | en_US.UTF-8        | configuration file
 lc_time                    | en_US.UTF-8        | configuration file
 listen_addresses           | *                  | configuration file
 log_timezone               | Europe/Madrid      | configuration file
 logging_collector          | on                 | configuration file
 maintenance_work_mem       | 4000MB             | configuration file
 max_connections            | 100                | configuration file
 max_stack_depth            | 2MB                | environment variable
 shared_buffers             | 10000MB            | configuration file
 TimeZone                   | Europe/Madrid      | configuration file
 work_mem                   | 32MB               | configuration file

The size of the table is 41 GB; some statistics:
 relname             | rows_in_bytes |  num_rows   | number_of_indexes | unique | single_column | multi_column 
entity2document2               | 89 MB         | 9.33479e+07 |                14 | Y      |            13 |            1


I'm running the CLUSTER on the table right now, using the name+hepval composite index as Venkata suggested, and will post back whether it works.
Anyway, even though the improvement is important, I'd still like better performance. When the number of rows returned is high, performance degrades too much.

Sorry, i have not been following this since sometime now.

Hardware configuration is better now. You were running on 8.3.x; can you please let us know what version of Postgres this is?

Did you collect the latest statistics and perform a VACUUM after the migration?
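A quick way to check when statistics were last refreshed is the cumulative-statistics view (a sketch; table name as used earlier in the thread):

```sql
-- NULL timestamps here mean the operation has not run since stats were reset:
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'entity2document2';
```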

Can you get us the EXPLAIN plan for "select * from entity2document2  where name='Acetaminophen' ; " ?

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia

Re: Query taking long time

From
"acanada"
Date:
Hello,

The new server's Postgres version is 9.3. I'm not sure if I collected the latest statistics after migration; if you mean whether the current_settings or the EXPLAIN ANALYZE queries that I posted were collected after migration... yes (notice that there are EXPLAIN ANALYZE outputs from before and after the migration; maybe I didn't present that clearly).
Sorry for that. Reading the statistics collector manual, I see there are plenty of parameters, and I'm not sure which of them you are interested in, or whether there's a query to collect them...


This is the EXPLAIN for the query after clearing the cache (the table name has changed; that's not a mistake...)

explain analyze select * from entity2document  where name='Acetaminophen';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on entity2document  (cost=104.47..17914.96 rows=4632 width=138) (actual time=62.811..12208.446 rows=2845 loops=1)
   Recheck Cond: ((name)::text = 'Acetaminophen'::text)
   ->  Bitmap Index Scan on entity2document_name_index  (cost=0.00..103.31 rows=4632 width=0) (actual time=34.357..34.357 rows=2845 loops=1)
         Index Cond: ((name)::text = 'Acetaminophen'::text)
 Total runtime: 12216.115 ms
(5 rows)

It's much better now than on the old server (39756.507 ms), but still high. I'd like to improve it...
Thank you very much.

Cheers,

Andrés

PS: Also notice that this query runs after denormalizing the database to avoid joins of very big tables. Once the performance is good enough I'd like to normalize it again, if possible... :-)






On Mar 20, 2014, at 12:30 AM, Venkata Balaji Nagothi wrote:

> Sorry, i have not been following this since sometime now.
>
> Hardware configuration is better now. You were running on 8.3.x; can you please let us know what version of Postgres this is?
>
> Did you collect the latest statistics and perform a VACUUM after the migration?
>
> Can you get us the EXPLAIN plan for "select * from entity2document2  where name='Acetaminophen' ; " ?
>
> Venkata Balaji N
>
> Sr. Database Administrator
> Fujitsu Australia





Re: Query taking long time

From
Venkata Balaji Nagothi
Date:

On Thu, Mar 20, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

The new server's Postgres version is 9.3. I'm not sure if I collected the latest statistics after migration; if you mean whether the current_settings or the EXPLAIN ANALYZE queries that I posted were collected after migration... yes (notice that there are EXPLAIN ANALYZE outputs from before and after the migration; maybe I didn't present that clearly).
Sorry for that. Reading the statistics collector manual, I see there are plenty of parameters, and I'm not sure which of them you are interested in, or whether there's a query to collect them...

Hi Andres,

If we do not have statistics, it's hard to arrive at any conclusion regarding performance. The cost numbers we get without statistics are not accurate.

Also, you have migrated across 5 major versions since 8.3. It is very important to have latest statistics in place.

Please perform VACUUM FULL and ANALYZE of the database.
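As a sketch of that step, using the table name from earlier in the thread:

```sql
-- VACUUM FULL rewrites the whole table and takes an ACCESS EXCLUSIVE lock:
VACUUM FULL entity2document;

-- Refresh planner statistics afterwards:
ANALYZE entity2document;

-- A lighter-weight alternative when the exclusive lock is unacceptable:
-- VACUUM ANALYZE entity2document;
```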

Please post the EXPLAIN plan after that.

Thanks & Regards,

Venkata Balaji N
Fujitsu Australia