Re: demystifying nested loop vs. merge join query plan choice - Mailing list pgsql-general

From BladeOfLight16
Subject Re: demystifying nested loop vs. merge join query plan choice
Date
Msg-id CA+=1U=WP6v6xc3s-qTCxdaRpvVLOr6fVk_mn0fJDRMjqWRb7TA@mail.gmail.com
Whole thread Raw
In response to Re: demystifying nested loop vs. merge join query plan choice  (Sandeep Gupta <gupta.sandeep@gmail.com>)
Responses Re: demystifying nested loop vs. merge join query plan choice  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-general
On Thu, Aug 1, 2013 at 10:25 AM, Sandeep Gupta <gupta.sandeep@gmail.com> wrote:
@Jeff : Thanks for pointing this out. Turns out that was the case. 

@Tom: Thank you for the reference to random_page_cost parameters. It would be very useful for us. Would go through the rest of the documentation as well.

I can't say what Jeff mentioned; maybe he didn't reply to the user list. Anyhow, sorry if this is repeating information.

I cannot help but point something glaring out in the EXPLAIN, though:

database 1:
Index Only Scan using tc_did_idx on tc  (cost=0.00..1298125.32 rows=49987616 width=4)

database 2:
Index Only Scan using tc_did_idx on tc  (cost=0.00..70.44 rows=3 width=4)

Maybe I just don't know how to read EXPLAIN plans, but it would appear that the estimated rows from the index only scan in the two plans is different by a factor of about 16.7 million. database 1 also processes about 7.7 million rows before the aggregate, where database 2 only processes about 1.3 million. For some reason, it appears that database 2 is able to eliminate far more rows more quickly, resulting in a faster query. Have both databases had VACUUM ANALYZE run on them? Are the statistics collection settings the same?

pgsql-general by date:

Previous
From: BladeOfLight16
Date:
Subject: Re: Add a NOT NULL column with default only during add
Next
From: Adrian Klaver
Date:
Subject: Re: Add a NOT NULL column with default only during add