Thread: Performance problems with 8.1.1 compared to 7.4.7
Hello, we have a PostgreSQL for datawarehousing. As we heard of so many enhancements for 8.0 and 8.1 versions we dicided to upgrade from 7.4 to 8.1. I must say that the COPY FROM processes are much faster now from 27 to 17 minutes. Some queries where slower, but the performance problems were solved by increasing work_mem to 8192. However, now we have a query that is much slower with 8.1 compared to 7.4. The query lasts 7minutes (all the times we try) with 8.1, keeping CPU usage at 93~97% while it lasts 25 seconds in 7.4 the first time going down to 4 seconds the following tries. We're not experts at all but we can't see anything strange with the differences of EXPLAIN in the queries. Below I paste the query and the EXPLAIN output. Does somebody have a clue of what could be the cause of this big difference in performance? Many thanks in advance. SELECT lpad(c.codigo,6,'0'), MIN(c.nombre), SUM( CASE WHEN ( res.hora_inicio >= time '00:00' AND res.hora_inicio < time '16:00' ) THEN (CASE WHEN res.importe_neto IS NOT NULL THEN res.importe_neto ELSE 0 END) ELSE 0 END ) AS p1, SUM( CASE WHEN ( res.hora_inicio >= time '00:00' AND res.hora_inicio < time '16:00' ) THEN (CASE WHEN res.cantidad_servida IS NOT NULL THEN res.cantidad_servida ELSE 0 END) ELSE 0 END ) AS p2, SUM( CASE WHEN ( res.hora_inicio >= time '16:00' AND res.hora_inicio < time '23:59' ) THEN (CASE WHEN res.importe_neto IS NOT NULL THEN res.importe_neto ELSE 0 END) ELSE 0 END ) AS p3 SUM( CASE WHEN ( res.hora_inicio >= time '16:00' AND res.hora_inicio < time '23:59' ) THEN (CASE WHEN res.cantidad_servida IS NOT NULL THEN res.cantidad_servida ELSE 0 END) ELSE 0 END ) AS p4 SUM(CASE WHEN res.importe_neto IS NOT NULL THEN res.importe_neto ELSE 0 END) AS total, SUM(CASE WHEN res.cantidad_servida IS NOT NULL THEN res.cantidad_servida ELSE 0 END) AS total_lineas FROM clientes c LEFT JOIN ( SELECT la.cliente as cliente, es.hora_inicio as hora_inicio, la.albaran as albaran, la.cantidad_servida as cantidad_servida, la.importe_neto as importe_neto FROM lineas_albaranes la LEFT JOIN escaner es ON la.albaran = es.albaran WHERE la.fecha_albaran = '20-12-2005' AND la.empresa = 1 AND la.indicador_factura = 'F' ) AS res ON c.codigo = res.cliente, provincias p WHERE p.codigo = c.provincia AND p.nombre='NAME' AND EXISTS(SELECT 1 FROM lineas_albaranes la WHERE la.cliente=c.codigo AND la.fecha_albaran > (date '20-12-2005' - interval '2 month') AND la.fecha_albaran <= '20-12-2005' AND la.empresa=1 AND la.indicador_factura='F') GROUP BY c.codigo ORDER BY nom; PostgreSQL 8.1.1: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Sort (cost=333769.99..333769.99 rows=2 width=61) Sort Key: min((c.nombre)::text) -> GroupAggregate (cost=37317.41..333769.98 rows=2 width=61) -> Nested Loop (cost=37317.41..333769.83 rows=2 width=61) Join Filter: ("inner".codigo = "outer".provincia) -> Merge Left Join (cost=37315.27..333758.58 rows=405 width=65) Merge Cond: ("outer".codigo = "inner".cliente) -> Index Scan using clientes_pkey on clientes c (cost=0.00..296442.28 rows=405 width=40) Filter: (subplan) SubPlan -> Bitmap Heap Scan on lineas_albaranes la (cost=138.99..365.53 rows=1 width=0) Recheck Cond: ((cliente = $0) AND ((indicador_factura)::text = 'F'::text)) Filter: ((fecha_albaran > '2005-10-20 00:00:00'::timestamp without time zone) AND (fecha_albaran <= '2005-12-20'::date)AND (empresa = 1)) -> BitmapAnd (cost=138.99..138.99 rows=57 width=0) -> Bitmap Index Scan on lineas_albaranes_cliente_idx (cost=0.00..65.87 rows=11392 width=0) Index Cond: (cliente = $0) -> Bitmap Index Scan on lineas_albaranes_indicador_factura_idx (cost=0.00..72.87 rows=11392 width=0) Index Cond: ((indicador_factura)::text = 'F'::text) -> Sort (cost=37315.27..37315.28 rows=1 width=29) Sort Key: la.cliente -> Nested Loop Left Join (cost=72.87..37315.26 rows=1 width=29) -> Bitmap Heap Scan on lineas_albaranes la (cost=72.87..37309.24 rows=1 width=25) Recheck Cond: ((indicador_factura)::text = 'F'::text) Filter: ((fecha_albaran = '2005-12-20'::date) AND (empresa = 1)) -> Bitmap Index Scan on lineas_albaranes_indicador_factura_idx (cost=0.00..72.87 rows=11392 width=0) Index Cond: ((indicador_factura)::text = 'F'::text) -> Index Scan using escaner_pkey on escaner es (cost=0.00..6.01 rows=1 width=12) Index Cond: ("outer".albaran = es.albaran) -> Materialize (cost=2.14..2.15 rows=1 width=4) -> Seq Scan on provincias p (cost=0.00..2.14 rows=1 width=4) Filter: ((nombre)::text = 'NAME'::text) (31 rows) PostgreSQL 7.4.7: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=270300.14..270300.21 rows=29 width=61) Sort Key: min((c.nombre)::text) -> HashAggregate (cost=270298.20..270299.44 rows=29 width=61) -> Hash Join (cost=270222.84..270297.62 rows=29 width=61) Hash Cond: ("outer".provincia = "inner".codigo) -> Merge Left Join (cost=270220.70..270280.70 rows=2899 width=65) Merge Cond: ("outer".codigo = "inner".cliente) -> Sort (cost=10928.47..10929.48 rows=405 width=40) Sort Key: c.codigo -> Seq Scan on clientes c (cost=0.00..10910.93 rows=405 width=40) Filter: (subplan) SubPlan -> Index Scan using lineas_albaranes_cliente_idx on lineas_albaranes la (cost=0.00..51542.10 rows=3860 width=0) Index Cond: (cliente = $0) Filter: (((fecha_albaran)::timestamp without time zone > '2005-10-20 00:00:00'::timestamp without time zone) AND (fecha_albaran <= '2005-12-20'::date) AND (empresa = 1) AND ((indicador_factura)::text = 'F'::text)) -> Sort (cost=259292.23..259306.72 rows=5797 width=29) Sort Key: la.cliente -> Merge Right Join (cost=256176.76..258929.88 rows=5797 width=29) Merge Cond: ("outer".albaran = "inner".albaran) -> Index Scan using escaner_pkey on escaner es (cost=0.00..2582.64 rows=55604 width=12) -> Sort (cost=256176.76..256191.26 rows=5797 width=25) Sort Key: la.albaran -> Seq Scan on lineas_albaranes la (cost=0.00..255814.42 rows=5797 width=25) Filter: ((fecha_albaran = '2005-12-20'::date) AND (empresa = 1) AND ((indicador_factura)::text = 'F'::text)) -> Hash (cost=2.14..2.14 rows=2 width=4) -> Seq Scan on provincias p (cost=0.00..2.14 rows=2 width=4) Filter: ((nombre)::text = 'NAME'::text) (27 rows)
On Tue, Dec 27, 2005 at 05:09:28PM +0100, Albert Cervera Areny wrote: > However, now we have a query that is much slower with 8.1 compared to 7.4. > The query lasts 7minutes (all the times we try) with 8.1, keeping CPU usage > at 93~97% while it lasts 25 seconds in 7.4 the first time going down to 4 > seconds the following tries. > We're not experts at all but we can't see anything strange with the > differences of EXPLAIN in the queries. Below I paste the query and the > EXPLAIN output. Could you post the EXPLAIN ANALYZE output of the query on both systems? That'll show how accurate the planner's estimates are. Have you run ANALYZE (or VACUUM ANALYZE) on the tables in both versions? The row count estimates in the 8.1.1 query differ from those in the 7.4.7 query. Are the two versions using the same data set? Are your configuration settings the same in both versions? You mentioned increasing work_mem, but what about others like effective_cache_size, random_page_cost, and shared_buffers? -- Michael Fuhr
A Dimarts 27 Desembre 2005 18:13, Michael Fuhr va escriure: > On Tue, Dec 27, 2005 at 05:09:28PM +0100, Albert Cervera Areny wrote: > > However, now we have a query that is much slower with 8.1 compared to > > 7.4. The query lasts 7minutes (all the times we try) with 8.1, keeping > > CPU usage at 93~97% while it lasts 25 seconds in 7.4 the first time going > > down to 4 seconds the following tries. > > We're not experts at all but we can't see anything strange with the > > differences of EXPLAIN in the queries. Below I paste the query and the > > EXPLAIN output. > > Could you post the EXPLAIN ANALYZE output of the query on both > systems? That'll show how accurate the planner's estimates are. > > Have you run ANALYZE (or VACUUM ANALYZE) on the tables in both > versions? The row count estimates in the 8.1.1 query differ from > those in the 7.4.7 query. Are the two versions using the same data > set? > > Are your configuration settings the same in both versions? You > mentioned increasing work_mem, but what about others like > effective_cache_size, random_page_cost, and shared_buffers? Hey, thank you for your fast response, I found what the problem was. I thought the settings were the same but work_mem was still higher in 7.4, 30Mb, so I increased 8.1 to 30Mb and it worked faster, down to 17 seconds the first time, 2.5 seconds for the others. Are there any "rules of thumb" to let a begginer give reasonable values to these parameters? Not only work_mem, but also random_page_cost, and so on. Are there any tests one can run to determine "good" values? Thanks a lot! -- Albert Cervera Areny Dept. Informàtica Sedifa, S.L. Av. Can Bordoll, 149 08202 - Sabadell (Barcelona) Tel. 93 715 51 11 Fax. 93 715 51 12 ==================================================================== ........................ AVISO LEGAL ............................ La presente comunicación y sus anexos tiene como destinatario la persona a la que va dirigida, por lo que si usted lo recibe por error debe notificarlo al remitente y eliminarlo de su sistema, no pudiendo utilizarlo, total o parcialmente, para ningún fin. Su contenido puede tener información confidencial o protegida legalmente y únicamente expresa la opinión del remitente. El uso del correo electrónico vía Internet no permite asegurar ni la confidencialidad de los mensajes ni su correcta recepción. En el caso de que el destinatario no consintiera la utilización del correo electrónico, deberá ponerlo en nuestro conocimiento inmediatamente. ==================================================================== ........................... DISCLAIMER ............................. This message and its attachments are intended exclusively for the named addressee. If you receive this message in error, please immediately delete it from your system and notify the sender. You may not use this message or any part of it for any purpose. The message may contain information that is confidential or protected by law, and any opinions expressed are those of the individual sender. Internet e-mail guarantees neither the confidentiality nor the proper receipt of the message sent. If the addressee of this message does not consent to the use of internet e-mail, please inform us inmmediately. ====================================================================
On Tue, 27 Dec 2005 19:02:17 +0100 Albert Cervera Areny <albert@sedifa.com> wrote: > Are there any "rules of thumb" to let a begginer give reasonable > values to these parameters? Not only work_mem, but also > random_page_cost, and so on. Are there any tests one can run to > determine "good" values? > Hi Albert, There are several online sites that have information related to tuning parameters. Here is a list of a few of them: http://revsys.com/writings/postgresql-performance.html http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html http://www.powerpostgresql.com/Docs http://www.powerpostgresql.com/PerfList Hope these help! --------------------------------- Frank Wiles <frank@wiles.org> http://www.wiles.org ---------------------------------