PostgreSQL 12.8 Same Query Same Execution Plan Different Time - Mailing list pgsql-performance

From Ludwig Isaac Lim
Subject PostgreSQL 12.8 Same Query Same Execution Plan Different Time
Date
Msg-id 1017307461.1415193.1642600375850@mail.yahoo.com
Whole thread Raw
Responses Re: PostgreSQL 12.8 Same Query Same Execution Plan Different Time  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-performance
Hello:

I ran 2 same queries on PostgreSQL 12.8 machine running in AWS RDS, the first time I ran the query

the query plan was:

GroupAggregate  (cost=455652.07..455664.99 rows=340 width=16) (actual time=124047.119..124048.777 rows=294 loops=1)
   Group Key: dvh.history_date
   ->  Sort  (cost=455652.07..455655.24 rows=1269 width=8) (actual time=124046.989..124047.857 rows=7780 loops=1)
         Sort Key: dvh.history_date
         Sort Method: quicksort  Memory: 557kB
         ->  Nested Loop  (cost=13708.92..455586.66 rows=1269 width=8) (actual time=12.228..124039.597 rows=7780
loops=1)
               ->  HashAggregate  (cost=13708.35..13746.45 rows=3810 width=4) (actual time=12.210..21.629 rows=6119
loops=1)
                     Group Key: pdv.id
                     ->  Nested Loop  (cost=0.84..13698.83 rows=3810 width=4) (actual time=0.038..10.650 rows=6119
loops=1)
                           ->  Index Scan using idx_pc_device_acct_num on pc_devices  (cost=0.42..565.32 rows=506
width=4)(actual time=0.014..0.193 rows=396 loops=1) 
                                 Index Cond: ((account_number)::text = 'AB32823833'::text)
                           ->  Index Scan using pdv_uindex on pdv  (cost=0.42..25.73 rows=23 width=8) (actual
time=0.007..0.024rows=15 loops=396) 
                                 Index Cond: (pc_device_id = pc_devices.id)
               ->  Index Scan using idx_pc_dvh_dvh_id on pdv_history dvh  (cost=0.57..115.96 rows=1 width=12) (actual
time=0.342..20.265rows=1 loops=6119) 
                     Index Cond: (pdv_id = pdv.id)
                     Filter: (status_changed AND (status_id = 1))
                     Rows Removed by Filter: 187
Planning Time: 1.050 ms
Execution Time: 124048.850 ms


After that, I reran the same query again. The plan is basically the same:


GroupAggregate  (cost=455652.07..455664.99 rows=340 width=16) (actual time=12180.624..12182.286 rows=294 loops=1)
   Group Key: dvh.history_date
   ->  Sort  (cost=455652.07..455655.24 rows=1269 width=8) (actual time=12180.493..12181.363 rows=7780 loops=1)
         Sort Key: dvh.history_date
         Sort Method: quicksort  Memory: 557kB
         ->  Nested Loop  (cost=13708.92..455586.66 rows=1269 width=8) (actual time=1709.341..12177.249 rows=7780
loops=1)
               ->  HashAggregate  (cost=13708.35..13746.45 rows=3810 width=4) (actual time=1709.319..1713.171 rows=6119
loops=1)
                     Group Key: pdv.id
                     ->  Nested Loop  (cost=0.84..13698.83 rows=3810 width=4) (actual time=0.379..1706.606 rows=6119
loops=1)
                           ->  Index Scan using idx_pc_device_acct_num on pc_devices  (cost=0.42..565.32 rows=506
width=4)(actual time=0.013..0.279 rows=396 loops=1) 
                                 Index Cond: ((account_number)::text = 'AB32823833'::text)
                           ->  Index Scan using pdv_uindex on pdv  (cost=0.42..25.73 rows=23 width=8) (actual
time=0.289..4.306rows=15 loops=396) 
                                 Index Cond: (pc_device_id = pc_devices.id)
               ->  Index Scan using idx_pc_dvh_dvh_id on pdv_history dvh  (cost=0.57..115.96 rows=1 width=12) (actual
time=0.063..1.709rows=1 loops=6119) 
                     Index Cond: (pdv_id = pdv.id)
                     Filter: (status_changed AND (status_id = 1))
                     Rows Removed by Filter: 187
Planning Time: 1.262 ms
Execution Time: 12182.361 ms


But the gap in the execution time between the two same queries is quite huge : 2 minutes vs 12 seconds.

I noticed that different is actually in Nested Loop join. One is taking 2 minutes, other is taking 12 seconds. I find
thispuzzling as I assume the nested loop should be done in memory. 

The disk is gp2 SDD so I'm even more baffled by this. What could be the factors that affect the speed of nested loop. I
noticefor that both loops the rows is 7780 and loops is 1. I don't think those are big numbers 

It was only after the running the 2 queries that I realize I could do EXPLAIN (ANALYZE, BUFFERS), but I couldn't
reproducethe slowness. 

Below are other information that might be relevant:

The database has been vacuum analyzed before running the queries.

Platform : AWS RDS
PG version : 12.8
effective_cache_size : 7.9 GB (7935800kB)
shared_buffers : 3.9 GB (3967896kB)
work_mem : 64MB
random_page_cost : 1.1
Instance type : db.m6g.xlarge (4 vCPUs / 32 GB RAM)
Database is idle. I'm the only one running the query
version : PostgreSQL 12.8 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit

Thanks,
Ludwig



pgsql-performance by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Unique constraint blues
Next
From: "David G. Johnston"
Date:
Subject: Re: PostgreSQL 12.8 Same Query Same Execution Plan Different Time