Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 - Mailing list pgsql-performance

From Rodrigo Rosenfeld Rosas
Subject Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Date
Msg-id 509944DE.6020400@gmail.com
Whole thread Raw
Responses Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
List pgsql-performance
Hello, this is my first message to this list, so sorry if this is not the right place to discuss this or if some data is missing from this message.

I'll gladly send any data you request that would help us to understand this issue. I don't believe I'm allowed to share the actual database dump, but other than that I can provide much more details you might ask for.

I can't understand why PG 9.2 performs so differently from PG 9.1.

I tested these queries in my Debian unstable amd64 box after restoring the same database dump this morning in both PG 9.1 (Debian unstable repository) and PG9.2 (Debian experimental repository) with same settings:

https://gist.github.com/3f1f3aad3847155e1e35

Ignore all lines like the line below because it doesn't make any difference on my tests if I just remove them or any other column from the SELECT clause:

"  exists(select id from condition_document_excerpt where condition_id=c1686.id) as v1686_has_reference,"

The results below are pretty much the same if you assume "SELECT 1 FROM ...".

I have proper indices created for all tables and the query  is fast in both PG versions when I don't use many conditions in the WHERE clause.

fast.sql returns the same data as slow.sql but it returns much faster in my tests with PG 9.1.

So here are the completion times for each query on each PG version:

Query   | PG 9.1         | PG 9.2 |
-----------------------------------
fast.sql| 650 ms (0.65s) | 690s   |
slow.sql| 419s           | 111s   |


For the curious, the results would be very similar to slow.sql if I use inner joins with the conditions inside the WHERE moved to the "ON" clause of the inner join instead of the left outer join + global WHERE approach. But I don't have this option anyway because this query is generated dynamically and not all my queries are "ALL"-like queries.

Here are the relevant indices (id is SERIAL primary key in all tables):

CREATE UNIQUE INDEX transaction_condition_transaction_id_type_id_idx
  ON transaction_condition
  USING btree
  (transaction_id, type_id);
CREATE INDEX index_transaction_condition_on_transaction_id
  ON transaction_condition
  USING btree
  (transaction_id);
CREATE INDEX index_transaction_condition_on_type_id
  ON transaction_condition
  USING btree
  (type_id);

CREATE INDEX acquirer_target_names
  ON company_transaction
  USING btree
  (acquiror_company_name COLLATE pg_catalog."default", target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_target_company_name
  ON company_transaction
  USING btree
  (target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_date
  ON company_transaction
  USING btree
  (date);
CREATE INDEX index_company_transaction_on_edit_status
  ON company_transaction
  USING btree
  (edit_status COLLATE pg_catalog."default");

CREATE UNIQUE INDEX index_condition_boolean_value_on_condition_id
  ON condition_boolean_value
  USING btree
  (condition_id);
CREATE INDEX index_condition_boolean_value_on_value_and_condition_id
  ON condition_boolean_value
  USING btree
  (value COLLATE pg_catalog."default", condition_id);

CREATE UNIQUE INDEX index_condition_option_value_on_condition_id
  ON condition_option_value
  USING btree
  (condition_id);
CREATE INDEX index_condition_option_value_on_value_id_and_condition_id
  ON condition_option_value
  USING btree
  (value_id, condition_id);


CREATE INDEX index_condition_option_label_on_type_id_and_position
  ON condition_option_label
  USING btree
  (type_id, "position");
CREATE INDEX index_condition_option_label_on_type_id_and_value
  ON condition_option_label
  USING btree
  (type_id, value COLLATE pg_catalog."default");


CREATE UNIQUE INDEX index_condition_string_value_on_condition_id
  ON condition_string_value
  USING btree
  (condition_id);
CREATE INDEX index_condition_string_value_on_value_and_condition_id
  ON condition_string_value
  USING btree
  (value COLLATE pg_catalog."default", condition_id);


Please let me know of any suggestions on how to try to get similar results in PG 9.2 as well as to understand why fast.sql performs so much better than slow.sql on PG 9.1.

Best,
Rodrigo.

pgsql-performance by date:

Previous
From: Denis
Date:
Subject: Re: [HACKERS] pg_dump and thousands of schemas
Next
From: Merlin Moncure
Date:
Subject: Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2