Thread: Bitmap Heap scan 8.1/8.2

Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
I have to PG servers, one ver. 8.1.9 and the other 8.2.4.

I was checking a query out and found that with the exact same DB (same
data in it) and the same query I get different plans, and significantly
higher time in 8.2:

On 8.1 I get:

test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
  vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=2793.74..2795.82 rows=832 width=20) (actual
time=25.795..25.832 rows=49 loops=1)
    Sort Key: vencimiento
    ->  Bitmap Heap Scan on prestamos  (cost=850.43..2753.39 rows=832
width=20) (actual time=20.747..25.529 rows=49 loops=1)
          Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
          ->  Bitmap Index Scan on prestamos_objetos_devuelto_idx
(cost=0.00..850.43 rows=2200 width=0) (actual time=20.265..20.265
rows=2301 loops=1)
                Index Cond: (devuelto = false)
  Total runtime: 25.971 ms
(7 filas)

On 8.2:

test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
  vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;
                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
  Sort  (cost=4100.77..4102.77 rows=800 width=20) (actual
time=95.082..95.103 rows=49 loops=1)
    Sort Key: vencimiento
    ->  Seq Scan on prestamos  (cost=0.00..4062.20 rows=800 width=20)
(actual time=7.293..82.778 rows=49 loops=1)
          Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
  Total runtime: 95.165 ms
(5 filas)

It's like it's ignoring the HEAP scan. Why? It's obviously (unless I'm
totally wrong) faster in this case.

--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------

Re: Bitmap Heap scan 8.1/8.2

From
"Pavel Stehule"
Date:
Hello

I am unsure, did you check config values?

Pavel

2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
> I have to PG servers, one ver. 8.1.9 and the other 8.2.4.
>
> I was checking a query out and found that with the exact same DB (same
> data in it) and the same query I get different plans, and significantly
> higher time in 8.2:
>
> On 8.1 I get:
>
> test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
>   vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;
>
> QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>   Sort  (cost=2793.74..2795.82 rows=832 width=20) (actual
> time=25.795..25.832 rows=49 loops=1)
>     Sort Key: vencimiento
>     ->  Bitmap Heap Scan on prestamos  (cost=850.43..2753.39 rows=832
> width=20) (actual time=20.747..25.529 rows=49 loops=1)
>           Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
> devuelto))
>           ->  Bitmap Index Scan on prestamos_objetos_devuelto_idx
> (cost=0.00..850.43 rows=2200 width=0) (actual time=20.265..20.265
> rows=2301 loops=1)
>                 Index Cond: (devuelto = false)
>   Total runtime: 25.971 ms
> (7 filas)
>
> On 8.2:
>
> test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
>   vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>   Sort  (cost=4100.77..4102.77 rows=800 width=20) (actual
> time=95.082..95.103 rows=49 loops=1)
>     Sort Key: vencimiento
>     ->  Seq Scan on prestamos  (cost=0.00..4062.20 rows=800 width=20)
> (actual time=7.293..82.778 rows=49 loops=1)
>           Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
> devuelto))
>   Total runtime: 95.165 ms
> (5 filas)
>
> It's like it's ignoring the HEAP scan. Why? It's obviously (unless I'm
> totally wrong) faster in this case.
>
> --
>   21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
> ---------------------------------------------------------
> Lic. Martín Marqués         |   SELECT 'mmarques' ||
> Centro de Telemática        |       '@' || 'unl.edu.ar';
> Universidad Nacional        |   DBA, Programador,
>      del Litoral             |   Administrador
> ---------------------------------------------------------
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Pavel Stehule wrote:
> Hello
>
> I am unsure, did you check config values?

Don't know which ones you are talking about, but all enable_* are set to on.

Anything else?

--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------

Re: Bitmap Heap scan 8.1/8.2

From
"Pavel Stehule"
Date:
2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
> Pavel Stehule wrote:
> > Hello
> >
> > I am unsure, did you check config values?
>
> Don't know which ones you are talking about, but all enable_* are set to on.
>
> Anything else?
>

shared_buffers
work_mem
effective_cache_size

Pavel


> --
>   21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
> ---------------------------------------------------------
> Lic. Martín Marqués         |   SELECT 'mmarques' ||
> Centro de Telemática        |       '@' || 'unl.edu.ar';
> Universidad Nacional        |   DBA, Programador,
>      del Litoral             |   Administrador
> ---------------------------------------------------------
>

Re: Bitmap Heap scan 8.1/8.2

From
Tomas Vondra
Date:
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>> Pavel Stehule wrote:
>>> Hello
>>>
>>> I am unsure, did you check config values?
>> Don't know which ones you are talking about, but all enable_* are set to on.
>>
>> Anything else?
>>
>
> shared_buffers
> work_mem
> effective_cache_size
>
> Pavel

Well, the cost_* values might be interesting too. That is

seq_page_cost
random_page_cost
cpu_tuple_cost
cpu_index_tuple_cost
cpu_operator_cost
effective_cache_size

(the first one is available in 8.2 only). My guess is that there are
different values, and the 8.2 overestimates the index scan - which seems
to be incorrect.

Try to disable the seqscan in the 8.2 database (set enable_seqscan =
off), and run the explain analyze again. This time it should choose
different query plan - maybe the index scan as in 8.1.

Another thing you might try is setting the cost values to the same
values in both databases - it might help.

And what does it mean by 'same data' - have you vacuumed / analyzed both
of them? What does this return:

select relname, relpages, reltuples from pg_class where relname =
'prestamos' or relname='prestamos_objetos_devuelto_idx';

That should return number of tuples / occupied pages in the table and index.

Tomas

Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Pavel Stehule wrote:
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>> Pavel Stehule wrote:
>>> Hello
>>>
>>> I am unsure, did you check config values?
>> Don't know which ones you are talking about, but all enable_* are set to on.
>>
>> Anything else?
>>
>
> shared_buffers

8.1:

  16000


8.2:

  400MB


> work_mem

8.1:

  8192

8.2:

  4MB


> effective_cache_size

8.1:

  1000

8.2:

  128MB


--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------

Re: Bitmap Heap scan 8.1/8.2

From
"Pavel Stehule"
Date:
2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
> Pavel Stehule wrote:
> > 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
> >> Pavel Stehule wrote:
> >>> Hello
> >>>
> >>> I am unsure, did you check config values?
> >> Don't know which ones you are talking about, but all enable_* are set to on.
> >>
> >> Anything else?
> >>
> >
> > shared_buffers
>
> 8.1:
>
>   16000
~ 128M
>
>
> 8.2:
>
>   400MB
>
>
> > work_mem
>
> 8.1:
>
>   8192
8M !!!! 8>4
>
> 8.2:
>
>   4MB
>
>
> > effective_cache_size
>
> 8.1:
>
>   1000
8M
>
> 8.2:
>
>   128MB
>
>

try

set work_mem to '8MB';
and
explain analyze select ..

Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Tomas Vondra wrote:
>
>> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>>> Pavel Stehule wrote:
>>>> Hello
>>>>
>>>> I am unsure, did you check config values?
>>> Don't know which ones you are talking about, but all enable_* are set
>>> to on.
>>>
>>> Anything else?
>>>
>>
>> shared_buffers
>> work_mem
>> effective_cache_size
>>
>> Pavel
>
> Well, the cost_* values might be interesting too. That is
>
> seq_page_cost
> random_page_cost
> cpu_tuple_cost
> cpu_index_tuple_cost
> cpu_operator_cost
> effective_cache_size
>
> (the first one is available in 8.2 only). My guess is that there are
> different values, and the 8.2 overestimates the index scan - which seems
> to be incorrect.
>
> Try to disable the seqscan in the 8.2 database (set enable_seqscan =
> off), and run the explain analyze again. This time it should choose
> different query plan - maybe the index scan as in 8.1.

siprebi-1.4=> SHOW enable_seqscan ;
  enable_seqscan
----------------
  off
(1 fila)

siprebi-1.4=> explain analyze SELECT usuarios,nticket,objeto,vencimiento
FROM prestamos WHERE biblioteca = 19 AND  vencimiento < now() AND NOT
devuelto ORDER BY vencimiento DESC;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=4365.26..4367.26 rows=800 width=20) (actual
time=30.736..30.755 rows=49 loops=1)
    Sort Key: vencimiento
    ->  Bitmap Heap Scan on prestamos  (cost=2502.69..4326.68 rows=800
width=20) (actual time=28.983..30.644 rows=49 loops=1)
          Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
          ->  Bitmap Index Scan on prestamos_usuarios_devuelto_idx
(cost=0.00..2502.49 rows=1976 width=0) (actual time=28.874..28.874
rows=2300 loops=1)
                Index Cond: (devuelto = false)
  Total runtime: 45.725 ms

Here I see that the time has been halved. So this plan is much better.

> Another thing you might try is setting the cost values to the same
> values in both databases - it might help.

Which ones and how?

> And what does it mean by 'same data' - have you vacuumed / analyzed both
> of them? What does this return:

dumped the 8.1 DB and restored in the 8.2. Both were vacummed analyze to
have accurate stats.

> select relname, relpages, reltuples from pg_class where relname =
> 'prestamos' or relname='prestamos_objetos_devuelto_idx';
>
> That should return number of tuples / occupied pages in the table and
> index.

select relname, relpages, reltuples from pg_class where relname =
'prestamos' or relname='prestamos_objetos_devuelto_idx';
             relname             | relpages | reltuples
--------------------------------+----------+-----------
  prestamos_objetos_devuelto_idx |      373 |    134697
  prestamos                      |     1705 |    134697




--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------


Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Pavel Stehule wrote:
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>> Pavel Stehule wrote:
>>> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>>>> Pavel Stehule wrote:
>>>>> Hello
>>>>>
>>>>> I am unsure, did you check config values?
>>>> Don't know which ones you are talking about, but all enable_* are set to on.
>>>>
>>>> Anything else?
>>>>
>>> shared_buffers
>> 8.1:
>>
>>   16000
> ~ 128M
>>
>> 8.2:
>>
>>   400MB
>>
>>
>>> work_mem
>> 8.1:
>>
>>   8192
> 8M !!!! 8>4
>> 8.2:
>>
>>   4MB
>>
>>
>>> effective_cache_size
>> 8.1:
>>
>>   1000
> 8M
>> 8.2:
>>
>>   128MB
>>
>>
>
> try
>
> set work_mem to '8MB';
> and
> explain analyze select ..

These things didn't help. What changed the plan completely was this:

seq_page_cost = 5.0                     # measured on an arbitrary scale
cpu_tuple_cost = 0.05                   # same scale as above

Specially the first one. Now I get this:

explain analyze SELECT usuarios,nticket,objeto,vencimiento FROM
prestamos WHERE biblioteca = 19 AND  vencimiento < now() AND NOT
devuelto ORDER BY vencimiento DESC;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=7058.86..7060.86 rows=800 width=20) (actual
time=22.850..22.888 rows=95 loops=1)
    Sort Key: vencimiento
    ->  Index Scan using prestamos_objetos_devuelto_idx on prestamos
(cost=0.00..7020.28 rows=800 width=20) (actual time=0.346..22.590
rows=95 loops=1)
          Index Cond: (devuelto = false)
          Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
  Total runtime: 22.973 ms


--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------

Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Martin Marques escribió:
> Pavel Stehule wrote:
>>
>> try
>>
>> set work_mem to '8MB';
>> and
>> explain analyze select ..
>
> These things didn't help. What changed the plan completely was this:
>
> seq_page_cost = 5.0                     # measured on an arbitrary scale
> cpu_tuple_cost = 0.05                   # same scale as above

Can someone explain how this parameters are measured? What is 5.0 in
this case for seq_page_cost?

Re: Bitmap Heap scan 8.1/8.2

From
"Pavel Stehule"
Date:
2007/10/23, Martin Marques <martin@bugs.unl.edu.ar>:
> Martin Marques escribió:
> > Pavel Stehule wrote:
> >>
> >> try
> >>
> >> set work_mem to '8MB';
> >> and
> >> explain analyze select ..
> >
> > These things didn't help. What changed the plan completely was this:
> >
> > seq_page_cost = 5.0                     # measured on an arbitrary scale
> > cpu_tuple_cost = 0.05                   # same scale as above
>
> Can someone explain how this parameters are measured? What is 5.0 in
> this case for seq_page_cost?
>

http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html

5.0 means so seq scan will be expensive for optimaliser, and
optimaliser will prefer index scan.

Re: Bitmap Heap scan 8.1/8.2

From
Alvaro Herrera
Date:
Martin Marques escribió:
> Martin Marques escribió:
>> Pavel Stehule wrote:
>>>
>>> try
>>>
>>> set work_mem to '8MB';
>>> and
>>> explain analyze select ..
>> These things didn't help. What changed the plan completely was this:
>> seq_page_cost = 5.0                     # measured on an arbitrary scale
>> cpu_tuple_cost = 0.05                   # same scale as above
>
> Can someone explain how this parameters are measured? What is 5.0 in this
> case for seq_page_cost?

It's an arbitrary number, based on which all the other numbers are
measured.

What people generally do around here is mess with random_page_cost, and
leave seq_page_cost alone.  Often, it's the ratio
seq_page_cost/random_page_cost what's most important to the cost
equations results.  (seq_page_cost wasn't tunable at all until
recently, say 8.1 or 8.2 AFAIR).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Bitmap Heap scan 8.1/8.2

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Martin Marques escribi�:
>>> seq_page_cost = 5.0                     # measured on an arbitrary scale

> What people generally do around here is mess with random_page_cost, and
> leave seq_page_cost alone.

It's also worth pointing out that having seq_page_cost >
random_page_cost is simply not sane, and is very likely to result in
performance problems in other queries.

            regards, tom lane

Re: Bitmap Heap scan 8.1/8.2

From
Martin Marques
Date:
Alvaro Herrera wrote:
>
> It's an arbitrary number, based on which all the other numbers are
> measured.

Now that I read more intensively he docs I see that all the cost
parameters are related one with the other.

> What people generally do around here is mess with random_page_cost, and
> leave seq_page_cost alone.  Often, it's the ratio
> seq_page_cost/random_page_cost what's most important to the cost
> equations results.  (seq_page_cost wasn't tunable at all until
> recently, say 8.1 or 8.2 AFAIR).

Ok, now the 8.1 server has a RAID1 hardware board with SCSI disks, and
the 8.2 is just a PentiumD with SATA disks (it's my desktop PC where I
do tests). Should I have a lower random_page_cost on a machine that is
likely to have a lower disk IO speed?

--
  21:50:04 up 2 days,  9:07,  0 users,  load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués         |   SELECT 'mmarques' ||
Centro de Telemática        |       '@' || 'unl.edu.ar';
Universidad Nacional        |   DBA, Programador,
     del Litoral             |   Administrador
---------------------------------------------------------

Re: Bitmap Heap scan 8.1/8.2

From
Tom Lane
Date:
Martin Marques <martin@bugs.unl.edu.ar> writes:
> Ok, now the 8.1 server has a RAID1 hardware board with SCSI disks, and
> the 8.2 is just a PentiumD with SATA disks (it's my desktop PC where I
> do tests). Should I have a lower random_page_cost on a machine that is
> likely to have a lower disk IO speed?

I'd guess the other way round: cheaper disks are likely to have worse
seek speeds, which translates to random accesses being proportionally
slower.

            regards, tom lane