Thread: why is bitmap index chosen for this query?

why is bitmap index chosen for this query?

From
Stephen Byers
Date:
Could someone explain the results of the following?  This is with postgres 8.1.2 on a database that was just vacuum-verbose-analyzed.  I have packets_i4 index which I am expecting to be used with this query but as you can see, I have have to convince its usage by turning off other scans.  The total runtime is pretty drastic when the index is not chosen.  When using a cursor, the query using the index is the only one that provides immediate results.  Also, what is Recheck Cond?
 
adbs_db=#   \d packets
                        Table "public.packets"
         Column          |          Type          |     Modifiers     
-------------------------+------------------------+--------------------
 system_time_secs        | integer                | not null
 system_time_subsecs     | integer                | not null
 spacecraft_time_secs    | integer                | not null
 spacecraft_time_subsecs | integer                | not null
 mnemonic                | character varying(64)  |
 mnemonic_id             | integer                | not null
 data_length             | integer                | not null
 data                    | bytea                  | not null
 volume_label            | character varying(128) | not null
 tlm_version_name        | character varying(32)  | not null
 environment_name        | character varying(32)  | not null
 quality                 | integer                | not null default 0
Indexes:
    "packets_i1" btree (volume_label)
    "packets_i4" btree (environment_name, system_time_secs, system_time_subsecs, mnemonic)
    "packets_i5" btree (environment_name, spacecraft_time_secs, spacecraft_time_subsecs, mnemonic)
 
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label
  from packets   where environment_name='PASITCTX01'
  and system_time_secs>=1132272000 and system_time_secs<=1143244800;

               QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on packets  (cost=247201.41..2838497.72 rows=12472989 width=47) (actual time=573856.344..771866.516 rows=13365371 loops=1)
   Recheck Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
   ->  Bitmap Index Scan on packets_i4  (cost=0.00..247201.41 rows=12472989 width=0) (actual time=573484.199..573484.199 rows=13365371 loops=1)
         Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 777208.041 ms
(5 rows)
 
adbs_db=# set enable_bitmapscan to off;
SET
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label
  from packets   where environment_name='PASITCTX01'
  and system_time_secs>=1132272000 and system_time_secs<=1143244800;

                                                              QUERY PLAN                                                              
---------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on packets  (cost=0.00..3045957.30 rows=12472989 width=47) (actual time=58539.693..493056.015 rows=13365371 loops=1)
   Filter: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 498620.963 ms
(3 rows)
 
adbs_db=# set enable_seqscan to off;
SET
 
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label
  from packets   where environment_name='PASITCTX01'
  and system_time_secs>=1132272000 and system_time_secs<=1143244800;

            QUERY PLAN                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using packets_i4 on packets  (cost=0.00..19908567.85 rows=12472989 width=47) (actual time=47.691..206028.754 rows=13365371 loops=1)
   Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 211644.843 ms
(3 rows)


Blab-away for as little as 1¢/min. Make PC-to-Phone Calls using Yahoo! Messenger with Voice.

Re: why is bitmap index chosen for this query?

From
"Steinar H. Gunderson"
Date:
On Thu, May 18, 2006 at 08:52:04AM -0700, Stephen Byers wrote:
> Could someone explain the results of the following?

It sounds like PostgreSQL badly overestimates the cost of the index scan.
Does the table perchance fit completely into memory, without
effective_cache_size indicating that?

> Also, what is Recheck Cond?

The bitmap index scan will by default allocate one bit per tuple. If it can't
hold a complete bitmap of every tuple in memory, it will fall back to
allocating one bit per (8 kB) page, since it will have to read the entire
page anyhow, and the most dramatic cost is the seek. However, in the latter
case, it will also get a few extra records that don't match the original
clause, so it will have to recheck the condition ("Recheck Cond") before
outputting the tuples to the parent node.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: why is bitmap index chosen for this query?

From
Stephen Byers
Date:
"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:

It sounds like PostgreSQL badly overestimates the cost of the index scan.
Does the table perchance fit completely into memory, without
effective_cache_size indicating that?

Don't know the exact way to answer your question, but my initial instinct is "no way."
  select pg_relation_size('packets');
 pg_relation_size
------------------
      19440115712
19GB.  So it's a big table.  The query I submitted earlier returns about 13M rows and the table currently has about 38M rows.
 
 


Love cheap thrills? Enjoy PC-to-Phone calls to 30+ countries for just 2¢/min with Yahoo! Messenger with Voice.

Re: why is bitmap index chosen for this query?

From
"Steinar H. Gunderson"
Date:
On Thu, May 18, 2006 at 09:41:23AM -0700, Stephen Byers wrote:
> Does the table perchance fit completely into memory, without
> effective_cache_size indicating that?
>
>   Don't know the exact way to answer your question, but my initial instinct is "no way."

What about the working set? Have you tried running the queries multiple times
in a row to see if the results change? It might be that your initial bitmap
scan puts all the relevant bits into cache, which will skew the results.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: why is bitmap index chosen for this query?

From
Tom Lane
Date:
"Steinar H. Gunderson" <sgunderson@bigfoot.com> writes:
> What about the working set? Have you tried running the queries multiple times
> in a row to see if the results change? It might be that your initial bitmap
> scan puts all the relevant bits into cache, which will skew the results.

If the examples were done in the order shown, the seqscan ought to have
pretty well blown out the cache ... but I concur that it'd be
interesting to check whether repeated executions of the same plan show
markedly different times.

Also, is the index order closely correlated to the actual physical
table order?

What is work_mem set to, and does increasing it substantially make the
bitmap scan work any better?

Considering that the query is fetching about half of the table, I'd have
thought that the planner was correct to estimate that bitmap or seqscan
ought to win.  For the plain indexscan to win, the order correlation
must be quite strong, and I'm also guessing that the bitmap scan must
have run into some substantial trouble (like discarding a lot of info
because of lack of work_mem).

IIRC, the planner doesn't currently try to model the effects of a bitmap
scan going into lossy mode, which is something it probably should try to
account for.

            regards, tom lane

Re: why is bitmap index chosen for this query?

From
Stephen Byers
Date:
I repeated explain analyze on the query 5 times and it came up with the same plan.

You asked about index order and physical table order.  In general the index order is indeed close to the same order as the physical table order.  However, this query is likely an exception.  The data is actually from a backup server that has filled a hole for some of the time range that I'm specifying in my query.
 
Work_mem was set to 10240.  After your suggestion, I bumped it to 102400 and it looks like it did significantly impact performance. 
 
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-#   where environment_name='PASITCTX01'
adbs_db-#   and system_time_secs>=1132272000 and system_time_secs<=1143244800;
                                                                   QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on packets  (cost=247205.64..2838556.55 rows=12473252 width=47) (actual time=32118.943..187075.742 rows=13365371 loops=1)
   Recheck Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
   ->  Bitmap Index Scan on packets_i4  (cost=0.00..247205.64 rows=12473252 width=0) (actual time=30370.789..30370.789 rows=13365371 loops=1)
         Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 191995.431 ms
(5 rows)
adbs_db=#
adbs_db=#
adbs_db=# adbs_db=# set enable_bitmapscan to off;
SET
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-#   where environment_name='PASITCTX01'
adbs_db-#   and system_time_secs>=1132272000 and system_time_secs<=1143244800;
                                                              QUERY PLAN                                                              
---------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on packets  (cost=0.00..3046021.47 rows=12473252 width=47) (actual time=56616.457..475839.789 rows=13365371 loops=1)
   Filter: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 481228.409 ms
(3 rows)
adbs_db=#
adbs_db=#
adbs_db=# adbs_db=# set enable_seqscan to off;
SET
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-#   where environment_name='PASITCTX01'
adbs_db-#   and system_time_secs>=1132272000 and system_time_secs<=1143244800;
                                                                  QUERY PLAN                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using packets_i4 on packets  (cost=0.00..19909080.77 rows=12473273 width=47) (actual time=3.511..188273.177 rows=13365371 loops=1)
   Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
 Total runtime: 194061.497 ms
(3 rows)
 
 
 
Wow -- so what does that mean?  Do I need to leave my work_mem at 100MB??  I mentioned that my application actually uses a cursor to walk through this data.  Even though the bitmap scan technically had the fastest time with explain analyze, it takes a long while (20 seconds) before the results start to come back through the cursor.  Conversely, with the index scan, results immediately come back through the cursor method (which is more desirable).  Thoughts? 
 
Example:
begin;
declare myCursor cursor for
  select spacecraft_time_secs,mnemonic,volume_label from packets
  where environment_name='PASITCTX01'
  and system_time_secs>=1132272000
  and system_time_secs<=1143244800;
fetch 10 from myCursor;
end;
 
PS, this is on a Sun Fire V240 with 4GB RAM, Solaris 8.
 
Thanks,
Steve
 

Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Steinar H. Gunderson" writes:
> What about the working set? Have you tried running the queries multiple times
> in a row to see if the results change? It might be that your initial bitmap
> scan puts all the relevant bits into cache, which will skew the results.

If the examples were done in the order shown, the seqscan ought to have
pretty well blown out the cache ... but I concur that it'd be
interesting to check whether repeated executions of the same plan show
markedly different times.

Also, is the index order closely correlated to the actual physical
table order?

What is work_mem set to, and does increasing it substantially make the
bitmap scan work any better?

Considering that the query is fetching about half of the table, I'd have
thought that the planner was correct to estimate that bitmap or seqscan
ought to win. For the plain indexscan to win, the order correlation
must be quite strong, and I'm also guessing that the bitmap scan must
have run into some substantial trouble (like discarding a lot of info
because of lack of work_mem).

IIRC, the planner doesn't currently try to model the effects of a bitmap
scan going into lossy mode, which is something it probably should try to
account for.

regards, tom lane


Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better.

Re: why is bitmap index chosen for this query?

From
"Steinar H. Gunderson"
Date:
On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote:
> I repeated explain analyze on the query 5 times and it came up with the same plan.

Yes, but did it end up with the same runtime? That's the interesting part --
the plan will almost always be identical between explain analyze runs given
that you haven't done anything in between them.

> You asked about index order and physical table order.  In general the index
> order is indeed close to the same order as the physical table order.
> However, this query is likely an exception.  The data is actually from a
> backup server that has filled a hole for some of the time range that I'm
> specifying in my query.

Well, it still isn't all that far-fetched to believe that the data has lots
of correlation (which helps the index scan quite a lot) that the planner
isn't able to pick up. I don't know the details here, so I can't tell you how
the correlation for such a query (WHERE a=foo and b between bar and baz) is
estimated. Something tells me someone else might, though. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: why is bitmap index chosen for this query?

From
Stephen Byers
Date:
Yes, here are the runtimes for the repeated query.
Total runtime: 748716.750 ms
Total runtime: 749170.934 ms
Total runtime: 744113.594 ms
Total runtime: 746314.251 ms
Total runtime: 742083.732 ms
Thanks,
Steve

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote:
> I repeated explain analyze on the query 5 times and it came up with the same plan.

Yes, but did it end up with the same runtime? That's the interesting part --
the plan will almost always be identical between explain analyze runs given
that you haven't done anything in between them.

> You asked about index order and physical table order. In general the index
> order is indeed close to the same order as the physical table order.
> However, this query is likely an exception. The data is actually from a
> backup server that has filled a hole for some of the time range that I'm
> specifying in my query.

Well, it still isn't all that far-fetched to believe that the data has lots
of correlation (which helps the index scan quite a lot) that the planner
isn't able to pick up. I don't know the details here, so I can't tell you how
the correlation for such a query (WHERE a=foo and b between bar and baz) is
estimated. Something tells me someone else might, though. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better.

Re: why is bitmap index chosen for this query?

From
"Steinar H. Gunderson"
Date:
On Thu, May 18, 2006 at 12:53:16PM -0700, Stephen Byers wrote:
> Yes, here are the runtimes for the repeated query.
>   Total runtime: 748716.750 ms
> Total runtime: 749170.934 ms
> Total runtime: 744113.594 ms
> Total runtime: 746314.251 ms
> Total runtime: 742083.732 ms

With which options enabled? This isn't even close to any of the three times
you already posted.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: why is bitmap index chosen for this query?

From
Stephen Byers
Date:
You may be comparing the values to Tom's suggestion to bump up work_mem.  Take a look at the original posting (Total runtime: 777208.041 ms for the bitmap scan)
 
-Steve

"Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote:
On Thu, May 18, 2006 at 12:53:16PM -0700, Stephen Byers wrote:
> Yes, here are the runtimes for the repeated query.
> Total runtime: 748716.750 ms
> Total runtime: 749170.934 ms
> Total runtime: 744113.594 ms
> Total runtime: 746314.251 ms
> Total runtime: 742083.732 ms

With which options enabled? This isn't even close to any of the three times
you already posted.

/* Steinar */
--
Homepage: http://www.sesse.net/

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly


Yahoo! Messenger with Voice. Make PC-to-Phone Calls to the US (and 30+ countries) for 2¢/min or less.

Re: why is bitmap index chosen for this query?

From
"Jim C. Nasby"
Date:
On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote:
> I repeated explain analyze on the query 5 times and it came up with the same plan.
>
> You asked about index order and physical table order.  In general the index order is indeed close to the same order
asthe physical table order.  However, this query is likely an exception.  The data is actually from a backup server
thathas filled a hole for some of the time range that I'm specifying in my query. 

What's SELECT correlation FROM pg_stats WHERE tablename='packets' AND
attname='environment_name' show?

What's effective_cache_size and random_page_cost set to?

Also, out of curiosity, why not just use a timestamp instead of two
int's  for storing time?

>   Wow -- so what does that mean?  Do I need to leave my work_mem at 100MB??  I mentioned that my application actually
usesa cursor to walk through this data.  Even though the bitmap scan technically had the fastest time with explain
analyze,it takes a long while (20 seconds) before the results start to come back through the cursor.  Conversely, with
theindex scan, results immediately come back through the cursor method (which is more desirable).  Thoughts?   

Do you really need to use a cursor? It's generally less efficient than
doing things with a single SQL statement, depending on what exactly
you're doing.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461