Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 - Mailing list pgsql-performance
From | Rodrigo Rosenfeld Rosas |
---|---|
Subject | Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 |
Date | |
Msg-id | 509944DE.6020400@gmail.com Whole thread Raw |
Responses |
Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 |
List | pgsql-performance |
Hello, this is my first message to this list, so sorry if this is not the right place to discuss this or if some data is missing from this message.
I'll gladly send any data you request that would help us to understand this issue. I don't believe I'm allowed to share the actual database dump, but other than that I can provide much more details you might ask for.
I can't understand why PG 9.2 performs so differently from PG 9.1.
I tested these queries in my Debian unstable amd64 box after restoring the same database dump this morning in both PG 9.1 (Debian unstable repository) and PG9.2 (Debian experimental repository) with same settings:
https://gist.github.com/3f1f3aad3847155e1e35
Ignore all lines like the line below because it doesn't make any difference on my tests if I just remove them or any other column from the SELECT clause:
" exists(select id from condition_document_excerpt where condition_id=c1686.id) as v1686_has_reference,"
The results below are pretty much the same if you assume "SELECT 1 FROM ...".
I have proper indices created for all tables and the query is fast in both PG versions when I don't use many conditions in the WHERE clause.
fast.sql returns the same data as slow.sql but it returns much faster in my tests with PG 9.1.
So here are the completion times for each query on each PG version:
Query | PG 9.1 | PG 9.2 |
-----------------------------------
fast.sql| 650 ms (0.65s) | 690s |
slow.sql| 419s | 111s |
For the curious, the results would be very similar to slow.sql if I use inner joins with the conditions inside the WHERE moved to the "ON" clause of the inner join instead of the left outer join + global WHERE approach. But I don't have this option anyway because this query is generated dynamically and not all my queries are "ALL"-like queries.
Here are the relevant indices (id is SERIAL primary key in all tables):
CREATE UNIQUE INDEX transaction_condition_transaction_id_type_id_idx
ON transaction_condition
USING btree
(transaction_id, type_id);
CREATE INDEX index_transaction_condition_on_transaction_id
ON transaction_condition
USING btree
(transaction_id);
CREATE INDEX index_transaction_condition_on_type_id
ON transaction_condition
USING btree
(type_id);
CREATE INDEX acquirer_target_names
ON company_transaction
USING btree
(acquiror_company_name COLLATE pg_catalog."default", target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_target_company_name
ON company_transaction
USING btree
(target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_date
ON company_transaction
USING btree
(date);
CREATE INDEX index_company_transaction_on_edit_status
ON company_transaction
USING btree
(edit_status COLLATE pg_catalog."default");
CREATE UNIQUE INDEX index_condition_boolean_value_on_condition_id
ON condition_boolean_value
USING btree
(condition_id);
CREATE INDEX index_condition_boolean_value_on_value_and_condition_id
ON condition_boolean_value
USING btree
(value COLLATE pg_catalog."default", condition_id);
CREATE UNIQUE INDEX index_condition_option_value_on_condition_id
ON condition_option_value
USING btree
(condition_id);
CREATE INDEX index_condition_option_value_on_value_id_and_condition_id
ON condition_option_value
USING btree
(value_id, condition_id);
CREATE INDEX index_condition_option_label_on_type_id_and_position
ON condition_option_label
USING btree
(type_id, "position");
CREATE INDEX index_condition_option_label_on_type_id_and_value
ON condition_option_label
USING btree
(type_id, value COLLATE pg_catalog."default");
CREATE UNIQUE INDEX index_condition_string_value_on_condition_id
ON condition_string_value
USING btree
(condition_id);
CREATE INDEX index_condition_string_value_on_value_and_condition_id
ON condition_string_value
USING btree
(value COLLATE pg_catalog."default", condition_id);
Please let me know of any suggestions on how to try to get similar results in PG 9.2 as well as to understand why fast.sql performs so much better than slow.sql on PG 9.1.
Best,
Rodrigo.
I'll gladly send any data you request that would help us to understand this issue. I don't believe I'm allowed to share the actual database dump, but other than that I can provide much more details you might ask for.
I can't understand why PG 9.2 performs so differently from PG 9.1.
I tested these queries in my Debian unstable amd64 box after restoring the same database dump this morning in both PG 9.1 (Debian unstable repository) and PG9.2 (Debian experimental repository) with same settings:
https://gist.github.com/3f1f3aad3847155e1e35
Ignore all lines like the line below because it doesn't make any difference on my tests if I just remove them or any other column from the SELECT clause:
" exists(select id from condition_document_excerpt where condition_id=c1686.id) as v1686_has_reference,"
The results below are pretty much the same if you assume "SELECT 1 FROM ...".
I have proper indices created for all tables and the query is fast in both PG versions when I don't use many conditions in the WHERE clause.
fast.sql returns the same data as slow.sql but it returns much faster in my tests with PG 9.1.
So here are the completion times for each query on each PG version:
Query | PG 9.1 | PG 9.2 |
-----------------------------------
fast.sql| 650 ms (0.65s) | 690s |
slow.sql| 419s | 111s |
For the curious, the results would be very similar to slow.sql if I use inner joins with the conditions inside the WHERE moved to the "ON" clause of the inner join instead of the left outer join + global WHERE approach. But I don't have this option anyway because this query is generated dynamically and not all my queries are "ALL"-like queries.
Here are the relevant indices (id is SERIAL primary key in all tables):
CREATE UNIQUE INDEX transaction_condition_transaction_id_type_id_idx
ON transaction_condition
USING btree
(transaction_id, type_id);
CREATE INDEX index_transaction_condition_on_transaction_id
ON transaction_condition
USING btree
(transaction_id);
CREATE INDEX index_transaction_condition_on_type_id
ON transaction_condition
USING btree
(type_id);
CREATE INDEX acquirer_target_names
ON company_transaction
USING btree
(acquiror_company_name COLLATE pg_catalog."default", target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_target_company_name
ON company_transaction
USING btree
(target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_date
ON company_transaction
USING btree
(date);
CREATE INDEX index_company_transaction_on_edit_status
ON company_transaction
USING btree
(edit_status COLLATE pg_catalog."default");
CREATE UNIQUE INDEX index_condition_boolean_value_on_condition_id
ON condition_boolean_value
USING btree
(condition_id);
CREATE INDEX index_condition_boolean_value_on_value_and_condition_id
ON condition_boolean_value
USING btree
(value COLLATE pg_catalog."default", condition_id);
CREATE UNIQUE INDEX index_condition_option_value_on_condition_id
ON condition_option_value
USING btree
(condition_id);
CREATE INDEX index_condition_option_value_on_value_id_and_condition_id
ON condition_option_value
USING btree
(value_id, condition_id);
CREATE INDEX index_condition_option_label_on_type_id_and_position
ON condition_option_label
USING btree
(type_id, "position");
CREATE INDEX index_condition_option_label_on_type_id_and_value
ON condition_option_label
USING btree
(type_id, value COLLATE pg_catalog."default");
CREATE UNIQUE INDEX index_condition_string_value_on_condition_id
ON condition_string_value
USING btree
(condition_id);
CREATE INDEX index_condition_string_value_on_value_and_condition_id
ON condition_string_value
USING btree
(value COLLATE pg_catalog."default", condition_id);
Please let me know of any suggestions on how to try to get similar results in PG 9.2 as well as to understand why fast.sql performs so much better than slow.sql on PG 9.1.
Best,
Rodrigo.
pgsql-performance by date: