Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2 - Mailing list pgsql-performance

From ktm@rice.edu
Subject Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Date
Msg-id 20121106180827.GG22303@aart.rice.edu
Whole thread Raw
In response to Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2  (Rodrigo Rosenfeld Rosas <rr.rosas@gmail.com>)
Responses Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2  (Rodrigo Rosenfeld Rosas <rr.rosas@gmail.com>)
List pgsql-performance
Hi Rodrigo,

It looks like a lot of joins and 9.2 does some optimizations that
internally add additional joins. Did you try raising the
join_collapse_limit and maybe the from_collapse_limit from the
default values of 8?

Regards,
Ken

On Tue, Nov 06, 2012 at 03:11:58PM -0200, Rodrigo Rosenfeld Rosas wrote:
> Hello, this is my first message to this list, so sorry if this is
> not the right place to discuss this or if some data is missing from
> this message.
>
> I'll gladly send any data you request that would help us to
> understand this issue. I don't believe I'm allowed to share the
> actual database dump, but other than that I can provide much more
> details you might ask for.
>
> I can't understand why PG 9.2 performs so differently from PG 9.1.
>
> I tested these queries in my Debian unstable amd64 box after
> restoring the same database dump this morning in both PG 9.1 (Debian
> unstable repository) and PG9.2 (Debian experimental repository) with
> same settings:
>
> https://gist.github.com/3f1f3aad3847155e1e35
>
> Ignore all lines like the line below because it doesn't make any
> difference on my tests if I just remove them or any other column
> from the SELECT clause:
>
> "  exists(select id from condition_document_excerpt where
> condition_id=c1686.id) as v1686_has_reference,"
>
> The results below are pretty much the same if you assume "SELECT 1
> FROM ...".
>
> I have proper indices created for all tables and the query  is fast
> in both PG versions when I don't use many conditions in the WHERE
> clause.
>
> fast.sql returns the same data as slow.sql but it returns much
> faster in my tests with PG 9.1.
>
> So here are the completion times for each query on each PG version:
>
> Query   | PG 9.1         | PG 9.2 |
> -----------------------------------
> fast.sql| 650 ms (0.65s) | 690s   |
> slow.sql| 419s           | 111s   |
>
>
> For the curious, the results would be very similar to slow.sql if I
> use inner joins with the conditions inside the WHERE moved to the
> "ON" clause of the inner join instead of the left outer join +
> global WHERE approach. But I don't have this option anyway because
> this query is generated dynamically and not all my queries are
> "ALL"-like queries.
>
> Here are the relevant indices (id is SERIAL primary key in all tables):
>
> CREATE UNIQUE INDEX transaction_condition_transaction_id_type_id_idx
>   ON transaction_condition
>   USING btree
>   (transaction_id, type_id);
> CREATE INDEX index_transaction_condition_on_transaction_id
>   ON transaction_condition
>   USING btree
>   (transaction_id);
> CREATE INDEX index_transaction_condition_on_type_id
>   ON transaction_condition
>   USING btree
>   (type_id);
>
> CREATE INDEX acquirer_target_names
>   ON company_transaction
>   USING btree
>   (acquiror_company_name COLLATE pg_catalog."default",
> target_company_name COLLATE pg_catalog."default");
> CREATE INDEX index_company_transaction_on_target_company_name
>   ON company_transaction
>   USING btree
>   (target_company_name COLLATE pg_catalog."default");
> CREATE INDEX index_company_transaction_on_date
>   ON company_transaction
>   USING btree
>   (date);
> CREATE INDEX index_company_transaction_on_edit_status
>   ON company_transaction
>   USING btree
>   (edit_status COLLATE pg_catalog."default");
>
> CREATE UNIQUE INDEX index_condition_boolean_value_on_condition_id
>   ON condition_boolean_value
>   USING btree
>   (condition_id);
> CREATE INDEX index_condition_boolean_value_on_value_and_condition_id
>   ON condition_boolean_value
>   USING btree
>   (value COLLATE pg_catalog."default", condition_id);
>
> CREATE UNIQUE INDEX index_condition_option_value_on_condition_id
>   ON condition_option_value
>   USING btree
>   (condition_id);
> CREATE INDEX index_condition_option_value_on_value_id_and_condition_id
>   ON condition_option_value
>   USING btree
>   (value_id, condition_id);
>
>
> CREATE INDEX index_condition_option_label_on_type_id_and_position
>   ON condition_option_label
>   USING btree
>   (type_id, "position");
> CREATE INDEX index_condition_option_label_on_type_id_and_value
>   ON condition_option_label
>   USING btree
>   (type_id, value COLLATE pg_catalog."default");
>
>
> CREATE UNIQUE INDEX index_condition_string_value_on_condition_id
>   ON condition_string_value
>   USING btree
>   (condition_id);
> CREATE INDEX index_condition_string_value_on_value_and_condition_id
>   ON condition_string_value
>   USING btree
>   (value COLLATE pg_catalog."default", condition_id);
>
>
> Please let me know of any suggestions on how to try to get similar
> results in PG 9.2 as well as to understand why fast.sql performs so
> much better than slow.sql on PG 9.1.
>
> Best,
> Rodrigo.


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] pg_dump and thousands of schemas
Next
From: Rodrigo Rosenfeld Rosas
Date:
Subject: Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2