Thread: BUG #10816: explain analyze reports different rows counts before and after sort
BUG #10816: explain analyze reports different rows counts before and after sort
From
bashtanov@imap.cc
Date:
The following bug has been logged on the website: Bug reference: 10816 Logged by: Alexey Bashtanov Email address: bashtanov@imap.cc PostgreSQL version: 9.3.4 Operating system: Ubuntu Linux 12.04 Description: Hello! Please have a look at the two bottom nodes of the plan. Function scan produces 1000 rows, then they are sorted and turn into 3201 rows! How could this happen? localhost/stat 13:31:19 > select version(); version ----------------------------------------------------------------------------------------------------- PostgreSQL 9.3.4 on i686-pc-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 32-bit (1 row) localhost/stat 13:31:21 > set enable_hashjoin to off; SET localhost/stat 13:31:22 > explain analyze select * from (select k % 5 b from generate_series(1,4000) k) l natural join generate_series(1,1000) b; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=119.66..227.16 rows=5000 width=4) (actual time=5.977..11.848 rows=3200 loops=1) Merge Cond: (((k.k % 5)) = b.b) -> Sort (cost=59.83..62.33 rows=1000 width=4) (actual time=4.571..5.825 rows=4000 loops=1) Sort Key: ((k.k % 5)) Sort Method: quicksort Memory: 221kB -> Function Scan on generate_series k (cost=0.00..10.00 rows=1000 width=4) (actual time=0.732..2.737 rows=4000 loops=1) -> Sort (cost=59.83..62.33 rows=1000 width=4) (actual time=0.878..1.942 rows=3201 loops=1) Sort Key: b.b Sort Method: quicksort Memory: 40kB -> Function Scan on generate_series b (cost=0.00..10.00 rows=1000 width=4) (actual time=0.193..0.528 rows=1000 loops=1) Total runtime: 12.814 ms (11 rows) Regards, Alexey Bashtanov
Re: BUG #10816: explain analyze reports different rows counts before and after sort
From
Tom Lane
Date:
bashtanov@imap.cc writes: > Please have a look at the two bottom nodes of the plan. Function scan > produces 1000 rows, then they are sorted and turn into 3201 rows! How could > this happen? The discrepancy is easily explained because the sort is the righthand child of a merge join. Merge join will rewind and rescan its right child every time the left child produces duplicate keys, so that the appropriate right-child rows get joined to all the similarly-keyed rows from the left child. And you do have duplicate outputs coming from that left child ... IOW, it's just a measurement artifact arising from re-fetching the same rows repeatedly. I think there's a warning about this in the docs. regards, tom lane