AGG_PLAIN thinks sorts are free - Mailing list pgsql-hackers

From Jeff Janes
Subject AGG_PLAIN thinks sorts are free
Date
Msg-id CAMkU=1yRW-_Ab=tW2uX1FY0A-kEjOJWiiFtx9oAx_F2P-nwFUw@mail.gmail.com
Whole thread Raw
Responses Re: AGG_PLAIN thinks sorts are free
List pgsql-hackers
AGG_PLAIN sometimes does sorts, but it thinks they are free.  Also, under explain analyze it does not explicitly report whether the sort was external or not, nor report the disk or memory usage, the way other sorts do.  I don't know if those two things are related or not.  

This behavior seems to be ancient, at least back to 8.4.

Does someone more familiar with this part of the code know if this is a simple oversight or a fundamental design issue?

Here is a test case, in which adding a "distinct" increases the run time 500% but doesn't change the estimate at all:

create table foo as select (random()*1000000)::int as val from generate_series(1,20000000);

analyze foo;


explain (analyze,buffers) select count(distinct val) from foo;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=338497.20..338497.21 rows=1 width=4) (actual time=28185.597..28185.598 rows=1 loops=1)
   Buffers: shared hit=192 read=88304, temp read=112326 written=112326
   I/O Timings: read=200.810
   ->  Seq Scan on foo  (cost=0.00..288496.96 rows=20000096 width=4) (actual time=0.040..2192.281 rows=20000000 loops=1)
         Buffers: shared hit=192 read=88304
         I/O Timings: read=200.810
 Total runtime: 28185.628 ms

explain (analyze,buffers) select count(val) from foo;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=338497.20..338497.21 rows=1 width=4) (actual time=4230.892..4230.892 rows=1 loops=1)
   Buffers: shared hit=224 read=88272
   I/O Timings: read=145.003
   ->  Seq Scan on foo  (cost=0.00..288496.96 rows=20000096 width=4) (actual time=0.098..2002.396 rows=20000000 loops=1)
         Buffers: shared hit=224 read=88272
         I/O Timings: read=145.003
 Total runtime: 4230.948 ms

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Differences in WHERE clause of SELECT
Next
From: Greg Smith
Date:
Subject: Re: [v9.4] row level security