Re: Slow HashAggregate/cache access - Mailing list pgsql-performance

From David Rowley
Subject Re: Slow HashAggregate/cache access
Date
Msg-id CAKJS1f9TNE4QZaFBkZCOqNC+inaHhnhJmzG3z=SRpoPYpRYDzg@mail.gmail.com
Whole thread Raw
In response to Re: Slow HashAggregate/cache access  (Andreas Joseph Krogh <andreas@visena.com>)
List pgsql-performance
On 6 August 2015 at 22:05, Andreas Joseph Krogh <andreas@visena.com> wrote:
På torsdag 06. august 2015 kl. 03:09:55, skrev Alexandre de Arruda Paes <adaldeia@gmail.com>:
 
Notice the seq scan on t1 instead of the index scan on t1_a_idx.
 
A way around this is to manually push the predicate down into the subquery:
 
explain select t2.a,s.sumv from (select a,sum(v) sumv from t1 where t1.a <= 1 group by a) s inner join t2 on t2.a = s.a where t2.a <= 1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop  (cost=0.42..21.98 rows=1 width=12)
   Join Filter: (t1.a = t2.a)
   ->  GroupAggregate  (cost=0.42..4.46 rows=1 width=8)
         Group Key: t1.a
         ->  Index Scan using t1_a_idx on t1  (cost=0.42..4.44 rows=1 width=8)
               Index Cond: (a <= 1)
   ->  Seq Scan on t2  (cost=0.00..17.50 rows=1 width=4)
         Filter: (a <= 1)
(8 rows) 
 
 
 
Hi David,
 
You are right. If the subquery includes the same filters of the main select (of the existing fields, sure), the times down to the floor (50 ms in the first execution and *18* ms by cache. Superb! ):
 
(...)  (SELECT SUM(fr13quant) AS fr13TotQtd, fr01codemp, fr02codigo, fr13dtlanc, SUM(COALESCE( fr13quant, 0) * CAST(COALESCE( fr13preco, 0) AS NUMERIC(18,10))) AS fr13VrTot
FROM FR13T1 WHERE (fr01codemp = '1' and fr13dtlanc >= '01/05/2014') AND (fr02codigo >= '0' and fr02codigo <= '9999999999') AND (fr13dtlanc <= '31/05/2014') GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T4 ON T4.fr01codemp = T1.fr01codemp AND T4.fr02codigo = T1.fr02codigo AND T4.fr13dtlanc = T1.fr13dtlanc)
(...)
 
                                                                                                       QUERY PLAN                                                                                                        
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=5770.32..7894.70 rows=1 width=130) (actual time=13.715..18.366 rows=2 loops=1)
   Join Filter: ((fr13t3.fr01codemp = t1.fr01codemp) AND (fr13t3.fr02codigo = t1.fr02codigo) AND (fr13t3.fr13dtlanc = t1.fr13dtlanc))
   Rows Removed by Join Filter: 368
   Buffers: shared hit=5920
   ->  Nested Loop Left Join  (cost=5764.18..7887.47 rows=1 width=98) (actual time=13.529..18.108 rows=2 loops=1)
         Join Filter: (t3.fr01codemp = t1.fr01codemp)
         Buffers: shared hit=5918
         ->  Nested Loop Left Join  (cost=5764.04..7887.30 rows=1 width=87) (actual time=13.519..18.094 rows=2 loops=1)
               Join Filter: ((fr13t1.fr01codemp = t1.fr01codemp) AND (fr13t1.fr02codigo = t1.fr02codigo) AND (fr13t1.fr13dtlanc = t1.fr13dtlanc))
               Rows Removed by Join Filter: 11144
               Buffers: shared hit=5914
               ->  Nested Loop Left Join  (cost=0.70..2098.42 rows=1 width=23) (actual time=0.796..2.071 rows=2 loops=1)
                     Buffers: shared hit=181
                     ->  Index Scan using ufr13t2 on fr13t t1  (cost=0.42..2094.11 rows=1 width=19) (actual time=0.787..2.054 rows=2 loops=1)
                           Index Cond: ((fr01codemp = 1::smallint) AND (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date))
                           Filter: ((fr02codigo >= 0::numeric) AND (fr02codigo <= 9999999999::numeric) AND (fr13codpr = 60732))
                           Rows Removed by Filter: 5621
                           Buffers: shared hit=175
                     ->  Index Scan using fr02t_pkey on fr02t t2  (cost=0.28..4.30 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=2)
                           Index Cond: ((fr01codemp = t1.fr01codemp) AND (fr01codemp = 1::smallint) AND (fr02codigo = t1.fr02codigo))
                           Buffers: shared hit=6
               ->  HashAggregate  (cost=5763.34..5770.15 rows=681 width=21) (actual time=5.576..6.787 rows=5573 loops=2)
                     Buffers: shared hit=5733
                     ->  Index Scan using ufr13t15 on fr13t1  (cost=0.42..5644.31 rows=6802 width=21) (actual time=0.020..3.371 rows=7053 loops=1)
                           Index Cond: ((fr01codemp = 1::smallint) AND (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date) AND (fr02codigo >= 0::numeric) AND (fr02codigo <= 9999999999::numeric))
                           Buffers: shared hit=5733
         ->  Index Scan using fr09t_pkey on fr09t t3  (cost=0.14..0.16 rows=1 width=15) (actual time=0.005..0.005 rows=1 loops=2)
               Index Cond: ((fr01codemp = 1::smallint) AND (fr09cod = t2.fr09cod))
               Buffers: shared hit=4
   ->  HashAggregate  (cost=6.14..6.43 rows=29 width=17) (actual time=0.056..0.086 rows=184 loops=2)
         Buffers: shared hit=2
         ->  Seq Scan on fr13t3  (cost=0.00..4.30 rows=184 width=17) (actual time=0.003..0.027 rows=184 loops=1)
               Filter: (fr01codemp = 1::smallint)
               Buffers: shared hit=2
 Total runtime: 18.528 ms
(35 rows)
 
 
Tomorrow I will try to do the same with the other slow query, reporting here.
 
It will be interesting to see how Oracle and SQL-Server perform with the re-written query too.
Thanks.


Glad that's looking better for you.

I'd guess that they're likely already pushing down those predicates into the subquery going by the execution times that you posted. 

I can't imagine Oracle can perform a seq scan / table scan that much faster than Postgres

Interested to hear the results of your tests though.

Regards

David Rowley

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
 

pgsql-performance by date:

Previous
From: Andreas Joseph Krogh
Date:
Subject: Re: Slow HashAggregate/cache access
Next
From: Rural Hunter
Date:
Subject: Query Plan Performance on Partitioned Table