Thread: Seemingly identical queries run at different speeds
Hello, I have three queries that are essentially identical. Two of them run in fractions of a second; one of them takes longer than 15 minutes to return. (Not sure how long it totally takes, that has been the limit of my patience.) The only difference between these queries is the use of table aliases in the sql query. What is happening in postgres that this makes such a difference? SQL queries below. Thanks, VanL FAST (0.017 second): select mm_batch.name as batch_name, mm_domain.name as domain_name, mm_management_unit.name as management_unit_name, mm_customer.firstname as customer_name, mm_legacy_account.username as old_username, mm_target_account.username as new_username from mm_batch, mm_domain, mm_management_unit, mm_customer, mm_legacy_account, mm_target_account where mm_domain.bid = mm_batch.id and mm_domain.mid = mm_management_unit.id and mm_domain.cid = mm_customer.id and mm_domain.lid = mm_legacy_account.id and mm_domain.tid = mm_target_account.id and mm_domain.name = 'example.com'; ======================================================== FAST (0.016 second): select B.name as batch_name, D.name as domain_name, M.name as management_unit_name, C.firstname as customer_name, LA.username as old_username, TA.username as new_username from mm_batch B, mm_domain D, mm_management_unit M, mm_customer C, mm_legacy_account LA, mm_target_account TA where D.bid = B.id and D.mid = M.id and D.cid = C.id and D.lid = LA.id and D.tid = TA.id and D.name = 'example.com'; ========================================================= SLOW ( > 15 minutes): select mm_batch.name as batch_name, mm_domain.name as domain_name, mm_management_unit.name as management_unit_name, mm_customer.firstname as customer_name, mm_legacy_account.username as old_username, mm_target_account.username as new_username from mm_batch B, mm_domain D, mm_management_unit M, mm_customer C, mm_legacy_account LA, mm_target_account TA where mm_domain.bid = mm_batch.id and mm_domain.mid = mm_management_unit.id and mm_domain.cid = mm_customer.id and mm_domain.lid = mm_legacy_account.id and mm_domain.tid = mm_target_account.id and mm_domain.name = 'example.com';
On Tue, 2003-07-22 at 19:27, VanL wrote: > I have three queries that are essentially identical. Two of them run in > fractions of a second; one of them takes longer than 15 minutes to > return. (Not sure how long it totally takes, that has been the limit of > my patience.) > > The only difference between these queries is the use of table aliases in > the sql query. What is happening in postgres that this makes such a > difference? You're getting bit by a 'feature' that is supposed to make life easier. PostgreSQL automatically adds tables you reference to the FROM clause if they're not listed there already, it then sends a NOTICE saying it did so. This is supposed to be helpful, here it certainly is not. What's happening in the below is you reference mm_batch, mm_domain, etc. which are _different_ than B, D, etc. because aliasing changes what query processor refers to them as (allowing stuff like self-joins). So Pg is appending them to the FROM clause, which now looks like this: FROM mm_batch B, mm_domain D, mm_management_unit M, mm_customer C, mm_legacy_account LA, mm_target_account TA, mm_batch, mm_domain, mm_management_unit, mm_customer, mm_legacy_account, mm_target_account Yikes. It's now generating an absolutely huge cartesian product between the result set you want and all those extra tables. I believe in the upcoming 7.4.x release this won't happen unless you enable ADD_MISSING_FROM in your conf. Something I certainly won't be enabling due to this type of situation. > SLOW ( > 15 minutes): > select > mm_batch.name as batch_name, > mm_domain.name as domain_name, > mm_management_unit.name as management_unit_name, > mm_customer.firstname as customer_name, > mm_legacy_account.username as old_username, > mm_target_account.username as new_username > from > mm_batch B, > mm_domain D, > mm_management_unit M, > mm_customer C, > mm_legacy_account LA, > mm_target_account TA > where > mm_domain.bid = mm_batch.id > and mm_domain.mid = mm_management_unit.id > and mm_domain.cid = mm_customer.id > and mm_domain.lid = mm_legacy_account.id > and mm_domain.tid = mm_target_account.id > and mm_domain.name = 'example.com';
On Tue, Jul 22, 2003 at 05:27:37PM -0600, VanL wrote: > I have three queries that are essentially identical. Two of them run in > fractions of a second; one of them takes longer than 15 minutes to > return. (Not sure how long it totally takes, that has been the limit of > my patience.) > > The only difference between these queries is the use of table aliases in > the sql query. What is happening in postgres that this makes such a > difference? I think the third query is adding items to the FROM clause that you probably don't want. The result is a cartesian product, which is going to take eons to materialize (i.e. give results back). See: http://www14.us.postgresql.org/docs/7.3/static/sql-select.html#R2-SQL-SELECT-4 -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Estoy de acuerdo contigo en que la verdad absoluta no existe... El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)