Fast HashJoin only after a cluster/recreate table - Mailing list pgsql-performance

From Alexandre de Arruda Paes
Subject Fast HashJoin only after a cluster/recreate table
Date
Msg-id CAGewt-t0izL+zaK6B3YpMDbkBoTU=ppBUY1exnFTVT4J6PV67Q@mail.gmail.com
Whole thread Raw
Responses Re: Fast HashJoin only after a cluster/recreate table
List pgsql-performance
Hi,

In the query below, the planner choose an extreme slow mergejoin(380 seconds). 'Vacuum analyze' can't help.
If I CLUSTER (or recreate) table ES09T1, the planner choose a faster hashjoin (about 10 seconds). But, obviously, I can't do that with the users connected.
After some time after cluster(generally in the same day), the problem returns. Autovacuum is on, but the tables are vacuumed forced after pg_dump, 3 times in a day (00:00 - 12:00 - 23:00).

Postgresql 9.4.5
128GB RAM/10xRAID10 SAS 15k
shared_buffers = 8GB                                                                        
work_mem = 256MB                                                                                                                        
maintenance_work_mem = 16GB
random_page_cost = 2.0                                                                                   
effective_cache_size = 120GB       


db=# explain (buffers,analyze) SELECT T1.es09item, T1.es09status, T3.es09usuari, T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc, T1.es09tipdoc AS es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS es09quatre FROM (((ES09T1 T1 LEFT JOIN ES08T T2 ON T2.es08tipdoc = T1.es09tipdoc) LEFT JOIN ES09T T3 ON T3.es09codemp = T1.es09codemp AND T3.es09tipdoc = T1.es09tipdoc AND T3.es09numdoc = T1.es09numdoc) LEFT JOIN (SELECT COUNT(*) AS es09quatre, es09codemp, es09tipdoc, es09numdoc FROM ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON T4.es09codemp = T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND T4.es09numdoc = T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq >= '2016-02-02' and T3.es09datreq <= '2016-02-02') and (T3.es09usuari like '%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=289546.93..289546.94 rows=2 width=78) (actual time=380405.796..380405.929 rows=2408 loops=1)
   Sort Key: t1.es09numdoc, t1.es09tipdoc
   Sort Method: quicksort  Memory: 435kB
   Buffers: shared hit=82163
   ->  Merge Left Join  (cost=47.09..289546.92 rows=2 width=78) (actual time=1133.077..380398.160 rows=2408 loops=1)
         Merge Cond: (t1.es09tipdoc = es09t1.es09tipdoc)
         Join Filter: ((es09t1.es09codemp = t1.es09codemp) AND (es09t1.es09numdoc = t1.es09numdoc))
         Rows Removed by Join Filter: 992875295
         Buffers: shared hit=82163
         ->  Merge Left Join  (cost=46.53..49.29 rows=2 width=70) (actual time=12.206..18.155 rows=2408 loops=1)
               Merge Cond: (t1.es09tipdoc = t2.es08tipdoc)
               Buffers: shared hit=6821
               ->  Sort  (cost=9.19..9.19 rows=2 width=44) (actual time=11.611..12.248 rows=2408 loops=1)
                     Sort Key: t1.es09tipdoc
                     Sort Method: quicksort  Memory: 285kB
                     Buffers: shared hit=6814
                     ->  Nested Loop  (cost=1.11..9.18 rows=2 width=44) (actual time=0.040..10.398 rows=2408 loops=1)
                           Buffers: shared hit=6814
                           ->  Index Scan using ad_es09t_1 on es09t t3  (cost=0.56..4.58 rows=1 width=42) (actual time=0.020..0.687 rows=1212 loops=1)
                                 Index Cond: ((es09codemp = 1) AND (es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
                                 Filter: (es09usuari ~~ '%%%%%%%%%%%%%%%%%%%%'::text)
                                 Buffers: shared hit=108
                           ->  Index Scan using es09t1_pkey on es09t1 t1  (cost=0.56..4.59 rows=1 width=19) (actual time=0.006..0.007 rows=2 loops=1212)
                                 Index Cond: ((es09codemp = 1) AND (es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
                                 Filter: (es09tipdoc ~~ '%%%%%'::text)
                                 Buffers: shared hit=6706
               ->  Sort  (cost=37.35..38.71 rows=547 width=32) (actual time=0.592..2.206 rows=2919 loops=1)
                     Sort Key: t2.es08tipdoc
                     Sort Method: quicksort  Memory: 67kB
                     Buffers: shared hit=7
                     ->  Seq Scan on es08t t2  (cost=0.00..12.47 rows=547 width=32) (actual time=0.003..0.126 rows=547 loops=1)
                           Buffers: shared hit=7
         ->  Materialize  (cost=0.56..287644.85 rows=716126 width=23) (actual time=0.027..68577.800 rows=993087854 loops=1)
               Buffers: shared hit=75342
               ->  GroupAggregate  (cost=0.56..278693.28 rows=716126 width=15) (actual time=0.025..4242.453 rows=3607573 loops=1)
                     Group Key: es09t1.es09codemp, es09t1.es09tipdoc, es09t1.es09numdoc
                     Buffers: shared hit=75342
                     ->  Index Only Scan using es09t1_pkey on es09t1  (cost=0.56..199919.49 rows=7161253 width=15) (actual time=0.016..1625.031 rows=7160921 loops=1)
                           Index Cond: (es09codemp = 1)
                           Heap Fetches: 51499
                           Buffers: shared hit=75342
 Planning time: 50.129 ms
 Execution time: 380419.435 ms
(43 rows)


db=# vacuum ANALYZE es09t1;
VACUUM


db=# explain SELECT T1.es09item, T1.es09status, T3.es09usuari, T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc, T1.es09tipdoc AS es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS es09quatre FROM (((ES09T1 T1 LEFT
 JOIN ES08T T2 ON T2.es08tipdoc = T1.es09tipdoc) LEFT JOIN ES09T T3 ON T3.es09codemp = T1.es09codemp AND T3.es09tipdoc = T1.es09tipdoc AND T3.es09numdoc = T1.es09numdoc) LEFT JOIN (SELECT COUNT(*) AS es09quatre, es09codemp, es09tipdoc, e
s09numdoc FROM ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON T4.es09codemp = T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND T4.es09numdoc = T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq >= '2016-02-02' and
 T3.es09datreq <= '2016-02-02') and (T3.es09usuari like '%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;                                                                  
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=288400.09..288400.09 rows=2 width=78)
   Sort Key: t1.es09numdoc, t1.es09tipdoc
   ->  Merge Left Join  (cost=46.22..288400.08 rows=2 width=78)
         Merge Cond: (t1.es09tipdoc = es09t1.es09tipdoc)
         Join Filter: ((es09t1.es09codemp = t1.es09codemp) AND (es09t1.es09numdoc = t1.es09numdoc))
         ->  Merge Left Join  (cost=45.66..48.43 rows=2 width=70)
               Merge Cond: (t1.es09tipdoc = t2.es08tipdoc)
               ->  Sort  (cost=9.19..9.19 rows=2 width=44)
                     Sort Key: t1.es09tipdoc
                     ->  Nested Loop  (cost=1.11..9.18 rows=2 width=44)
                           ->  Index Scan using ad_es09t_1 on es09t t3  (cost=0.56..4.58 rows=1 width=42)
                                 Index Cond: ((es09codemp = 1) AND (es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
                                 Filter: (es09usuari ~~ '%%%%%%%%%%%%%%%%%%%%'::text)
                           ->  Index Scan using es09t1_pkey on es09t1 t1  (cost=0.56..4.59 rows=1 width=19)
                                 Index Cond: ((es09codemp = 1) AND (es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
                                 Filter: (es09tipdoc ~~ '%%%%%'::text)
               ->  Sort  (cost=36.47..37.84 rows=549 width=32)
                     Sort Key: t2.es08tipdoc
                     ->  Seq Scan on es08t t2  (cost=0.00..11.49 rows=549 width=32)
         ->  Materialize  (cost=0.56..286496.26 rows=716037 width=23)
               ->  GroupAggregate  (cost=0.56..277545.79 rows=716037 width=15)
                     Group Key: es09t1.es09codemp, es09t1.es09tipdoc, es09t1.es09numdoc
                     ->  Index Only Scan using es09t1_pkey on es09t1  (cost=0.56..198781.81 rows=7160361 width=15)
                           Index Cond: (es09codemp = 1)
(24 rows)


----------------------------------------------------------------------------

db=# cluster es09t1;
CLUSTER

db=# explain (buffers,analyze) SELECT T1.es09item, T1.es09status, T3.es09usuari, T3.es09datreq, T2.es08desdoc AS es09desdoc, T1.es09numdoc, T1.es09tipdoc AS es09tipdoc, T1.es09codemp, COALESCE( T4.es09quatre, 0) AS es09quatre FROM (((ES09T1 T1 LEFT JOIN ES08T T2 ON T2.es08tipdoc = T1.es09tipdoc) LEFT JOIN ES09T T3 ON T3.es09codemp = T1.es09codemp AND T3.es09tipdoc = T1.es09tipdoc AND T3.es09numdoc = T1.es09numdoc) LEFT JOIN (SELECT COUNT(*) AS es09quatre, es09codemp, es09tipdoc, es09numdoc FROM ES09T1 GROUP BY es09codemp, es09tipdoc, es09numdoc ) T4 ON T4.es09codemp = T1.es09codemp AND T4.es09tipdoc = T1.es09tipdoc AND T4.es09numdoc = T1.es09numdoc) WHERE (T1.es09codemp = 1) and (T3.es09datreq >= '2016-02-02' and T3.es09datreq <= '2016-02-02') and (T3.es09usuari like '%%%%%%%%%%%%%%%%%%%%') and (T1.es09tipdoc like '%%%%%') ORDER BY T1.es09codemp, T1.es09numdoc DESC, T1.es09tipdoc;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=483816.33..483816.34 rows=2 width=78) (actual time=8794.833..8795.001 rows=2408 loops=1)
   Sort Key: t1.es09numdoc, t1.es09tipdoc
   Sort Method: quicksort  Memory: 435kB
   Buffers: shared hit=13649 read=299785
   ->  Nested Loop Left Join  (cost=461417.89..483816.32 rows=2 width=78) (actual time=6563.106..8790.845 rows=2408 loops=1)
         Buffers: shared hit=13649 read=299785
         ->  Hash Right Join  (cost=461417.61..483815.72 rows=2 width=52) (actual time=6563.082..8782.169 rows=2408 loops=1)
               Hash Cond: ((es09t1.es09codemp = t1.es09codemp) AND (es09t1.es09tipdoc = t1.es09tipdoc) AND (es09t1.es09numdoc = t1.es09numdoc))
               Buffers: shared hit=6425 read=299785
               ->  HashAggregate  (cost=461408.40..468575.79 rows=716739 width=15) (actual time=6548.467..7866.944 rows=3607578 loops=1)
                     Group Key: es09t1.es09codemp, es09t1.es09tipdoc, es09t1.es09numdoc
                     Buffers: shared hit=421 read=299566
                     ->  Seq Scan on es09t1  (cost=0.00..389734.56 rows=7167384 width=15) (actual time=2.154..1818.148 rows=7160931 loops=1)
                           Filter: (es09codemp = 1)
                           Rows Removed by Filter: 11849
                           Buffers: shared hit=421 read=299566
               ->  Hash  (cost=9.18..9.18 rows=2 width=44) (actual time=12.486..12.486 rows=2408 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 188kB
                     Buffers: shared hit=6004 read=219
                     ->  Nested Loop  (cost=1.11..9.18 rows=2 width=44) (actual time=0.076..11.112 rows=2408 loops=1)
                           Buffers: shared hit=6004 read=219
                           ->  Index Scan using ad_es09t_1 on es09t t3  (cost=0.56..4.58 rows=1 width=42) (actual time=0.035..0.743 rows=1212 loops=1)
                                 Index Cond: ((es09codemp = 1) AND (es09datreq >= '2016-02-02'::date) AND (es09datreq <= '2016-02-02'::date))
                                 Filter: (es09usuari ~~ '%%%%%%%%%%%%%%%%%%%%'::text)
                                 Buffers: shared hit=98 read=12
                           ->  Index Scan using es09t1_pkey on es09t1 t1  (cost=0.56..4.59 rows=1 width=19) (actual time=0.007..0.008 rows=2 loops=1212)
                                 Index Cond: ((es09codemp = 1) AND (es09tipdoc = t3.es09tipdoc) AND (es09numdoc = t3.es09numdoc))
                                 Filter: (es09tipdoc ~~ '%%%%%'::text)
                                 Buffers: shared hit=5906 read=207
         ->  Index Scan using es08t_pkey on es08t t2  (cost=0.28..0.29 rows=1 width=32) (actual time=0.002..0.003 rows=1 loops=2408)
               Index Cond: (es08tipdoc = t1.es09tipdoc)
               Buffers: shared hit=7224
 Planning time: 14.498 ms
 Execution time: 8819.824 ms
(34 rows)

 Best regards,

Alexandre


pgsql-performance by date:

Previous
From: Karl Denninger
Date:
Subject: Re: Big number of connections
Next
From: jarek
Date:
Subject: Re: Big number of connections