Re: [Devel 9.2] Index-only scan in count aggregation - Mailing list pgsql-hackers
From | Emanuel Calvo |
---|---|
Subject | Re: [Devel 9.2] Index-only scan in count aggregation |
Date | |
Msg-id | CAGHEX6YnwFH82w=fJzfVHWBj6XEfLZCE7vCFRNx++p93mPP-eQ@mail.gmail.com Whole thread Raw |
In response to | Re: [Devel 9.2] Index-only scan in count aggregation (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [Devel 9.2] Index-only scan in count aggregation
|
List | pgsql-hackers |
El día 18 de abril de 2012 18:17, Robert Haas <robertmhaas@gmail.com> escribió: > On Wed, Apr 18, 2012 at 12:13 PM, Emanuel Calvo <postgres.arg@gmail.com> wrote: >> Hi guys, >> >> I'm one of the nightly sources of 9.2devel. I was trying some simple >> queries and I realized something: >> >> stuff=# explain (analyze true, costs true, buffers true, timing true, >> verbose true) select count(i) from lot_of_values; >> QUERY >> PLAN >> ------------------------------------------------------------------------------------------------------------------------------------------- >> Aggregate (cost=213496.00..213496.01 rows=1 width=4) (actual >> time=60400.788..60400.791 rows=1 loops=1) >> Output: count(i) >> Buffers: shared hit=2400 read=86096 >> -> Seq Scan on public.lot_of_values (cost=0.00..188496.00 >> rows=10000000 width=4) (actual time=0.371..32227.791 rows=10000000 >> loops=1) >> Output: i, t1, r1, r2, r3, d1 >> Buffers: shared hit=2400 read=86096 >> Total runtime: 60402.460 ms >> (7 rows) >> >> stuff=# set enable_seqscan=off; >> SET >> stuff=# explain (analyze true, costs true, buffers true, timing true, >> verbose true) select count(i) from lot_of_values; >> >> QUERY PLAN >> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> Aggregate (cost=351292.03..351292.04 rows=1 width=4) (actual >> time=63278.472..63278.475 rows=1 loops=1) >> Output: count(i) >> Buffers: shared hit=1 read=110379 >> -> Index Only Scan using lot_of_values_pkey on >> public.lot_of_values (cost=0.00..326292.03 rows=10000000 width=4) >> (actual time=42.028..35217.460 rows=10000000 loops=1) >> Output: i >> Heap Fetches: 10000000 >> Buffers: shared hit=1 read=110379 >> Total runtime: 63278.720 ms >> (8 rows) >> >> >> I know, still development. Just wanna know if there will be an >> improvement for this in the next patches or the idea is to maintain >> this behaviour. >> >> Cheers and thanks for the amazing work you all had done! > > I'm not sure what you're unhappy about. It seems that the query > planner picked the fastest plan (a sequential scan) and then when you > disabled that it picked the second-fastest plan (an index-only scan). > > The index-only scan would have a chance of beating the sequential scan > if the table had been recently vacuumed, but not in the case where > every row is going to require a heap fetch. > Oh, I see now. Honestly, I thought it wasn't necessary to make a heap fetch. The table doesn't have any modifications, but with the vacuum the cost changed. Checking the source code, I saw what you are talking about: /* * We can skip the heap fetch if the TID references a heap page on * which all tuples are known visible to everybody. In any case, * we'll use the index tuple not the heap tuple as the data source. */ Thanks for the information! -- -- Emanuel Calvo
pgsql-hackers by date: