Thread: Performance problems with 8.1.1 compared to 7.4.7

Performance problems with 8.1.1 compared to 7.4.7

From

Albert Cervera Areny

Date:

27 December 2005, 15:16:17

Hello,
    we have a PostgreSQL for datawarehousing. As we heard of so many enhancements
for 8.0 and 8.1 versions we dicided to upgrade from 7.4 to 8.1. I must say
that the COPY FROM processes are much faster now from 27 to 17 minutes. Some
queries where slower, but the performance problems were solved by increasing
work_mem to 8192.
    However, now we have a query that is much slower with 8.1 compared to 7.4.
The query lasts 7minutes (all the times we try) with 8.1, keeping CPU usage
at 93~97% while it lasts 25 seconds in 7.4 the first time going down to 4
seconds the following tries.
    We're not experts at all but we can't see anything strange with the
differences of EXPLAIN in the queries. Below I paste the query and the
EXPLAIN output.
    Does somebody have a clue of what could be the cause of this big difference
in performance?
    Many thanks in advance.


SELECT
        lpad(c.codigo,6,'0'),
        MIN(c.nombre),

        SUM( CASE WHEN ( res.hora_inicio >= time '00:00' AND res.hora_inicio <
time '16:00' )
                THEN (CASE WHEN res.importe_neto IS NOT NULL
                        THEN res.importe_neto ELSE 0 END)
                ELSE 0 END ) AS p1,
        SUM( CASE WHEN ( res.hora_inicio >= time '00:00' AND res.hora_inicio <
time '16:00' )
                THEN (CASE WHEN res.cantidad_servida IS NOT NULL
                        THEN res.cantidad_servida
                        ELSE 0 END)
                ELSE 0 END ) AS p2,
        SUM( CASE WHEN ( res.hora_inicio >= time '16:00' AND res.hora_inicio <
time '23:59' )
                THEN (CASE WHEN res.importe_neto IS NOT NULL
                        THEN res.importe_neto
                        ELSE 0 END)
                ELSE 0 END ) AS p3
        SUM( CASE WHEN ( res.hora_inicio >= time '16:00' AND res.hora_inicio <
time '23:59' )
                THEN (CASE WHEN res.cantidad_servida IS NOT NULL THEN
                        res.cantidad_servida
                        ELSE 0 END)
                ELSE 0 END ) AS p4
        SUM(CASE WHEN res.importe_neto IS NOT NULL
                THEN res.importe_neto
                ELSE 0 END) AS total,
        SUM(CASE WHEN res.cantidad_servida IS NOT NULL
                THEN res.cantidad_servida
                ELSE 0 END) AS total_lineas
FROM clientes c LEFT JOIN (
        SELECT
                la.cliente as cliente,
                es.hora_inicio as hora_inicio,
                la.albaran as albaran,
                la.cantidad_servida as cantidad_servida,
                la.importe_neto as importe_neto
        FROM  lineas_albaranes la
        LEFT JOIN escaner es ON la.albaran = es.albaran
        WHERE la.fecha_albaran = '20-12-2005' AND la.empresa = 1 AND
la.indicador_factura = 'F'
        ) AS res ON c.codigo = res.cliente, provincias p
WHERE p.codigo = c.provincia AND p.nombre='NAME' AND EXISTS(SELECT 1 FROM
lineas_albaranes la WHERE la.cliente=c.codigo AND la.fecha_albaran > (date
'20-12-2005' - interval '2 month') AND la.fecha_albaran <= '20-12-2005' AND
la.empresa=1 AND la.indicador_factura='F')
GROUP BY c.codigo
ORDER BY nom;

PostgreSQL 8.1.1:


QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=333769.99..333769.99 rows=2 width=61)
   Sort Key: min((c.nombre)::text)
   ->  GroupAggregate  (cost=37317.41..333769.98 rows=2 width=61)
         ->  Nested Loop  (cost=37317.41..333769.83 rows=2 width=61)
               Join Filter: ("inner".codigo = "outer".provincia)
               ->  Merge Left Join  (cost=37315.27..333758.58 rows=405
width=65)
                     Merge Cond: ("outer".codigo = "inner".cliente)
                     ->  Index Scan using clientes_pkey on clientes c
(cost=0.00..296442.28 rows=405 width=40)
                           Filter: (subplan)
                           SubPlan
                             ->  Bitmap Heap Scan on lineas_albaranes la
(cost=138.99..365.53 rows=1 width=0)
                                   Recheck Cond: ((cliente = $0) AND
((indicador_factura)::text = 'F'::text))
                                   Filter: ((fecha_albaran > '2005-10-20
00:00:00'::timestamp without time zone) AND (fecha_albaran <=
'2005-12-20'::date)AND (empresa = 1))
                                   ->  BitmapAnd  (cost=138.99..138.99 rows=57
width=0)
                                         ->  Bitmap Index Scan on
lineas_albaranes_cliente_idx  (cost=0.00..65.87 rows=11392 width=0)
                                               Index Cond: (cliente = $0)
                                         ->  Bitmap Index Scan on
lineas_albaranes_indicador_factura_idx  (cost=0.00..72.87 rows=11392 width=0)
                                               Index Cond:
((indicador_factura)::text = 'F'::text)
                     ->  Sort  (cost=37315.27..37315.28 rows=1 width=29)
                           Sort Key: la.cliente
                           ->  Nested Loop Left Join  (cost=72.87..37315.26
rows=1 width=29)
                                 ->  Bitmap Heap Scan on lineas_albaranes la
(cost=72.87..37309.24 rows=1 width=25)
                                       Recheck Cond:
((indicador_factura)::text = 'F'::text)
                                       Filter: ((fecha_albaran =
'2005-12-20'::date) AND (empresa = 1))
                                       ->  Bitmap Index Scan on
lineas_albaranes_indicador_factura_idx  (cost=0.00..72.87 rows=11392 width=0)
                                             Index Cond:
((indicador_factura)::text = 'F'::text)
                                 ->  Index Scan using escaner_pkey on escaner
es  (cost=0.00..6.01 rows=1 width=12)
                                       Index Cond: ("outer".albaran =
es.albaran)
               ->  Materialize  (cost=2.14..2.15 rows=1 width=4)
                     ->  Seq Scan on provincias p  (cost=0.00..2.14 rows=1
width=4)
                           Filter: ((nombre)::text = 'NAME'::text)
(31 rows)


PostgreSQL 7.4.7:


                            QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=270300.14..270300.21 rows=29 width=61)
   Sort Key: min((c.nombre)::text)
   ->  HashAggregate  (cost=270298.20..270299.44 rows=29 width=61)
         ->  Hash Join  (cost=270222.84..270297.62 rows=29 width=61)
               Hash Cond: ("outer".provincia = "inner".codigo)
               ->  Merge Left Join  (cost=270220.70..270280.70 rows=2899
width=65)
                     Merge Cond: ("outer".codigo = "inner".cliente)
                     ->  Sort  (cost=10928.47..10929.48 rows=405 width=40)
                           Sort Key: c.codigo
                           ->  Seq Scan on clientes c  (cost=0.00..10910.93
rows=405 width=40)
                                 Filter: (subplan)
                                 SubPlan
                                   ->  Index Scan using
lineas_albaranes_cliente_idx on lineas_albaranes la  (cost=0.00..51542.10
rows=3860 width=0)
                                         Index Cond: (cliente = $0)
                                         Filter: (((fecha_albaran)::timestamp
without time zone > '2005-10-20 00:00:00'::timestamp without time zone) AND
(fecha_albaran <= '2005-12-20'::date) AND (empresa = 1) AND
((indicador_factura)::text = 'F'::text))
                     ->  Sort  (cost=259292.23..259306.72 rows=5797 width=29)
                           Sort Key: la.cliente
                           ->  Merge Right Join  (cost=256176.76..258929.88
rows=5797 width=29)
                                 Merge Cond: ("outer".albaran =
"inner".albaran)
                                 ->  Index Scan using escaner_pkey on escaner
es  (cost=0.00..2582.64 rows=55604 width=12)
                                 ->  Sort  (cost=256176.76..256191.26
rows=5797 width=25)
                                       Sort Key: la.albaran
                                       ->  Seq Scan on lineas_albaranes la
(cost=0.00..255814.42 rows=5797 width=25)
                                             Filter: ((fecha_albaran =
'2005-12-20'::date) AND (empresa = 1) AND ((indicador_factura)::text =
'F'::text))
               ->  Hash  (cost=2.14..2.14 rows=2 width=4)
                     ->  Seq Scan on provincias p  (cost=0.00..2.14 rows=2
width=4)
                           Filter: ((nombre)::text = 'NAME'::text)
(27 rows)

Re: Performance problems with 8.1.1 compared to 7.4.7

From

Michael Fuhr

Date:

27 December 2005, 16:13:53

On Tue, Dec 27, 2005 at 05:09:28PM +0100, Albert Cervera Areny wrote:
>     However, now we have a query that is much slower with 8.1 compared to 7.4.
> The query lasts 7minutes (all the times we try) with 8.1, keeping CPU usage
> at 93~97% while it lasts 25 seconds in 7.4 the first time going down to 4
> seconds the following tries.
>     We're not experts at all but we can't see anything strange with the
> differences of EXPLAIN in the queries. Below I paste the query and the
> EXPLAIN output.

Could you post the EXPLAIN ANALYZE output of the query on both
systems?  That'll show how accurate the planner's estimates are.

Have you run ANALYZE (or VACUUM ANALYZE) on the tables in both
versions?  The row count estimates in the 8.1.1 query differ from
those in the 7.4.7 query.  Are the two versions using the same data
set?

Are your configuration settings the same in both versions?  You
mentioned increasing work_mem, but what about others like
effective_cache_size, random_page_cost, and shared_buffers?

--
Michael Fuhr

Re: Performance problems with 8.1.1 compared to 7.4.7

From

Albert Cervera Areny

Date:

27 December 2005, 17:02:23

A Dimarts 27 Desembre 2005 18:13, Michael Fuhr va escriure:
> On Tue, Dec 27, 2005 at 05:09:28PM +0100, Albert Cervera Areny wrote:
> >     However, now we have a query that is much slower with 8.1 compared to
> > 7.4. The query lasts 7minutes (all the times we try) with 8.1, keeping
> > CPU usage at 93~97% while it lasts 25 seconds in 7.4 the first time going
> > down to 4 seconds the following tries.
> >     We're not experts at all but we can't see anything strange with the
> > differences of EXPLAIN in the queries. Below I paste the query and the
> > EXPLAIN output.
>
> Could you post the EXPLAIN ANALYZE output of the query on both
> systems?  That'll show how accurate the planner's estimates are.
>
> Have you run ANALYZE (or VACUUM ANALYZE) on the tables in both
> versions?  The row count estimates in the 8.1.1 query differ from
> those in the 7.4.7 query.  Are the two versions using the same data
> set?
>
> Are your configuration settings the same in both versions?  You
> mentioned increasing work_mem, but what about others like
> effective_cache_size, random_page_cost, and shared_buffers?

Hey, thank you for your fast response, I found what the problem was.

I thought the settings were the same but work_mem was still higher in 7.4,
30Mb, so I increased 8.1 to 30Mb and it worked faster, down to 17 seconds the
first time, 2.5 seconds for the others.

Are there any "rules of thumb" to let a begginer give reasonable values to
these parameters? Not only work_mem, but also random_page_cost, and so on.
Are there any tests one can run to determine "good" values?

Thanks a lot!

--
Albert Cervera Areny
Dept. Informàtica Sedifa, S.L.

Av. Can Bordoll, 149
08202 - Sabadell (Barcelona)
Tel. 93 715 51 11
Fax. 93 715 51 12

====================================================================
........................  AVISO LEGAL  ............................
La   presente  comunicación  y sus anexos tiene como destinatario la
persona a  la  que  va  dirigida, por  lo  que  si  usted lo  recibe
por error  debe  notificarlo  al  remitente  y   eliminarlo   de  su
sistema,  no  pudiendo  utilizarlo,  total  o   parcialmente,   para
ningún  fin.  Su  contenido  puede  tener información confidencial o
protegida legalmente   y   únicamente   expresa  la  opinión     del
remitente.  El   uso   del   correo   electrónico   vía Internet  no
permite   asegurar    ni  la   confidencialidad   de   los  mensajes
ni    su    correcta     recepción.   En    el  caso   de   que   el
destinatario no consintiera la utilización  del correo  electrónico,
deberá ponerlo en nuestro conocimiento inmediatamente.
====================================================================
........................... DISCLAIMER .............................
This message and its  attachments are  intended  exclusively for the
named addressee. If you  receive  this  message  in   error,  please
immediately delete it from  your  system  and notify the sender. You
may  not  use  this message  or  any  part  of it  for any  purpose.
The   message   may  contain  information  that  is  confidential or
protected  by  law,  and  any  opinions  expressed  are those of the
individual    sender.  Internet  e-mail   guarantees   neither   the
confidentiality   nor  the  proper  receipt  of  the  message  sent.
If  the  addressee  of  this  message  does  not  consent to the use
of   internet    e-mail,    please    inform     us    inmmediately.
====================================================================

Re: Performance problems with 8.1.1 compared to 7.4.7

From

Frank Wiles

Date:

27 December 2005, 17:56:53

On Tue, 27 Dec 2005 19:02:17 +0100
Albert Cervera Areny <albert@sedifa.com> wrote:

> Are there any "rules of thumb" to let a begginer give reasonable
> values to these parameters? Not only work_mem, but also
> random_page_cost, and so on. Are there any tests one can run to
> determine "good" values?
>

  Hi Albert,

  There are several online sites that have information related to
  tuning parameters.  Here is a list of a few of them:

  http://revsys.com/writings/postgresql-performance.html

  http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

  http://www.powerpostgresql.com/Docs

  http://www.powerpostgresql.com/PerfList

  Hope these help!

 ---------------------------------
   Frank Wiles <frank@wiles.org>
   http://www.wiles.org
 ---------------------------------