Re: COUNT(*) and index-only scans - Mailing list pgsql-hackers

From Thom Brown
Subject Re: COUNT(*) and index-only scans
Date
Msg-id CAA-aLv75+ZAPKP1uyrLsK2N028u9aCjOrwTSSVbESqnmz5oz+A@mail.gmail.com
Whole thread Raw
In response to Re: COUNT(*) and index-only scans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: COUNT(*) and index-only scans
List pgsql-hackers
On 19 November 2011 16:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> So is there a chance of getting bitmap index-only scans?
>
> Don't hold your breath.  It seems like a huge increment of complexity
> for probably very marginal gains.  The point of a bitmap scan (as
> opposed to regular indexscan) is to reduce heap accesses by combining
> visits to the same page; but it's not clear how useful that is if
> you're not making heap accesses at all.

Well consider:

pgbench=# explain analyse select count(*) from pgbench_accounts where
aid between 143243 and 374825 or aid between 523242 and 712111;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=83016.38..83016.39 rows=1 width=0) (actual 
time=1039.282..1039.282 rows=1 loops=1)  ->  Bitmap Heap Scan on pgbench_accounts  (cost=7934.54..81984.58
rows=412719 width=0) (actual time=243.012..977.946 rows=420453
loops=1)        Recheck Cond: (((aid >= 143243) AND (aid <= 374825)) OR ((aid
>= 523242) AND (aid <= 712111)))        ->  BitmapOr  (cost=7934.54..7934.54 rows=423802 width=0)
(actual time=228.934..228.934 rows=0 loops=1)              ->  Bitmap Index Scan on pgbench_accounts_pkey
(cost=0.00..4299.40 rows=235782 width=0) (actual time=134.410..134.410
rows=231583 loops=1)                    Index Cond: ((aid >= 143243) AND (aid <= 374825))              ->  Bitmap Index
Scanon pgbench_accounts_pkey 
(cost=0.00..3428.78 rows=188020 width=0) (actual time=94.520..94.520
rows=188870 loops=1)                    Index Cond: ((aid >= 523242) AND (aid <= 712111))Total runtime: 1039.598 ms
(9 rows)

Since I can't get this to use an index-only scan, it will always visit
the heap.  Instead I'd be forced to write:

pgbench=# explain analyse select count(*) from (select aid from
pgbench_accounts where aid between 143243 and 374825 union all select
aid from pgbench_accounts where aid between 523242 and 712111) x;
          QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=17263.72..17263.73 rows=1 width=0) (actual 
time=232.053..232.053 rows=1 loops=1)  ->  Append  (cost=0.00..16204.22 rows=423802 width=0) (actual
time=10.925..195.134 rows=420453 loops=1)        ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..9015.04
rows=235782 width=0) (actual time=10.925..90.116 rows=231583 loops=1)              ->  Index Only Scan using
pgbench_accounts_pkeyon 
pgbench_accounts  (cost=0.00..6657.22 rows=235782 width=4) (actual
time=10.924..61.822 rows=231583 loops=1)                    Index Cond: ((aid >= 143243) AND (aid <= 374825))        ->
Subquery Scan on "*SELECT* 2"  (cost=0.00..7189.18 
rows=188020 width=0) (actual time=0.062..62.953 rows=188870 loops=1)              ->  Index Only Scan using
pgbench_accounts_pkeyon 
pgbench_accounts  (cost=0.00..5308.98 rows=188020 width=4) (actual
time=0.061..40.343 rows=188870 loops=1)                    Index Cond: ((aid >= 523242) AND (aid <= 712111))Total
runtime:232.291 ms 
(9 rows)

These 2 queries are equal only because the ranges being checked don't
overlap, so if arbitrary values were being substituted, and the ranges
did happen to overlap, that last method couldn't work.  I'd have to
use a UNION ALL in that particular case, which adds a lot of overhead
due to de-duplication.

While I accept that maybe adapting the existing bitmap index scan
functionality isn't necessarily desirable, would it be feasible to
create a corresponding bitmap index-only scan method.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: COUNT(*) and index-only scans
Next
From: Greg Smith
Date:
Subject: Re: Core Extensions relocation