Thread: Query running longer

From: veem v

Hello All,
We have the below query, which runs for ~45 seconds on a Postgres Aurora reader instance. I have captured the EXPLAIN ANALYZE output and want to understand where exactly the time is being spent and whether we can optimize the query further. It is a UI query that shows the top 50 rows and is supposed to finish in under a second, but it takes around 45 seconds.
I also see multiple workers in the plan; does that mean the query is running in parallel?

explain (analyze,verbose,costs,buffers)  select TAB1.PRI from SCHEMA1.TAB1 TAB1
inner join SCHEMA1.TAB4 TAB4 on TAB4.PRI = TAB1.PRI
inner join SCHEMA1."TAB2" TAB2 on TAB2.PRI = TAB1.PRI
inner join SCHEMA1.TAB3 a2 on a2.AID = TAB2.AID
       where TAB2.MID = XXXXX and TAB4.TAB4_code = 'XX'
         and TAB2.TAB2_time between '2024-01-01' and '2024-01-31'
       order by TAB2.TAB2_time desc
       limit 50;


Limit  (cost=13052924.01..13052924.14 rows=50 width=45) (actual time=45211.971..45224.720 rows=50 loops=1)
"  Output: TAB1.PRI, TAB2.TAB2_time"
  Buffers: shared hit=1980943 read=2335820
  I/O Timings: shared/local read=112477.014
  ->  Sort  (cost=13052924.01..13052924.19 rows=70 width=45) (actual time=45211.969..45224.713 rows=50 loops=1)
"        Output: TAB1.PRI, TAB2.TAB2_time"
        Sort Key: TAB2.TAB2_time DESC
        Sort Method: top-N heapsort  Memory: 32kB
        Buffers: shared hit=1980943 read=2335820
        I/O Timings: shared/local read=112477.014
        ->  Gather  (cost=92917.38..13052921.87 rows=70 width=45) (actual time=947.004..45221.915 rows=5428 loops=1)
"              Output: TAB1.PRI, TAB2.TAB2_time"
              Workers Planned: 4
              Workers Launched: 4
              Buffers: shared hit=1980943 read=2335820
              I/O Timings: shared/local read=112477.014
              ->  Nested Loop  (cost=91917.38..13051914.87 rows=18 width=45) (actual time=945.946..45195.224 rows=1086 loops=5)
"                    Output: TAB1.PRI, TAB2.TAB2_time"
                    Inner Unique: true
                    Buffers: shared hit=1980943 read=2335820
                    I/O Timings: shared/local read=112477.014
                    Worker 0:  actual time=936.808..45193.518 rows=1036 loops=1
                      Buffers: shared hit=382606 read=465076
                      I/O Timings: shared/local read=22452.028
                    Worker 1:  actual time=947.246..45194.168 rows=1055 loops=1
                      Buffers: shared hit=383165 read=484189
                      I/O Timings: shared/local read=22617.135
                    Worker 2:  actual time=933.623..45192.534 rows=1145 loops=1
                      Buffers: shared hit=415758 read=473182
                      I/O Timings: shared/local read=22741.488
                    Worker 3:  actual time=965.639..45193.603 rows=1078 loops=1
                      Buffers: shared hit=398009 read=449053
                      I/O Timings: shared/local read=22221.094
                    ->  Nested Loop  (cost=91916.81..13051828.80 rows=18 width=81) (actual time=945.917..43729.931 rows=1086 loops=5)
"                          Output: TAB1.PRI, TAB2.TAB2_time, TAB2.AID"
                          Inner Unique: true
                          Join Filter: ((TAB4.PRI)::text = (TAB1.PRI)::text)
                          Buffers: shared hit=1962289 read=2328363
                          I/O Timings: shared/local read=105246.220
                          Worker 0:  actual time=936.781..43732.652 rows=1036 loops=1
                            Buffers: shared hit=379077 read=463587
                            I/O Timings: shared/local read=21008.508
                          Worker 1:  actual time=947.212..43699.507 rows=1055 loops=1
                            Buffers: shared hit=379573 read=482704
                            I/O Timings: shared/local read=21142.572
                          Worker 2:  actual time=933.589..43696.710 rows=1145 loops=1
                            Buffers: shared hit=411836 read=471634
                            I/O Timings: shared/local read=21266.581
                          Worker 3:  actual time=965.608..43768.535 rows=1078 loops=1
                            Buffers: shared hit=394288 read=447583
                            I/O Timings: shared/local read=20814.288
                          ->  Parallel Hash Join  (cost=91916.24..13051765.39 rows=18 width=117) (actual time=945.879..42758.939 rows=1086 loops=5)
"                                Output: TAB4.PRI, TAB2.TAB2_time, TAB2.PRI, TAB2.AID"
                                Hash Cond: ((TAB2.PRI)::text = (TAB4.PRI)::text)
                                Buffers: shared hit=1943792 read=2322814
                                I/O Timings: shared/local read=100496.787
                                Worker 0:  actual time=936.743..42798.247 rows=1036 loops=1
                                  Buffers: shared hit=375573 read=462501
                                  I/O Timings: shared/local read=20094.654
                                Worker 1:  actual time=947.169..42752.987 rows=1055 loops=1
                                  Buffers: shared hit=375975 read=481619
                                  I/O Timings: shared/local read=20216.926
                                Worker 2:  actual time=933.545..42660.854 rows=1145 loops=1
                                  Buffers: shared hit=407956 read=470465
                                  I/O Timings: shared/local read=20252.386
                                Worker 3:  actual time=965.567..42797.288 rows=1078 loops=1
                                  Buffers: shared hit=390609 read=446481
                                  I/O Timings: shared/local read=19863.965
"                                ->  Parallel Bitmap Heap Scan on SCHEMA1.""TAB2"" TAB2  (cost=84860.50..13040301.00 rows=1175611 width=80) (actual time=713.054..26942.082 rows=956249 loops=5)"
"                                      Output: TAB2.TAB2_time, TAB2.PRI, TAB2.AID"
                                      Recheck Cond: (TAB2.MID = 'XXXXX'::numeric)
                                      Rows Removed by Index Recheck: 2137395
                                      Filter: ((TAB2.TAB2_time >= '2024-01-01 00:00:00+00'::timestamp with time zone) AND (TAB2.TAB2_time <= '2024-01-31 00:00:00+00'::timestamp with time zone))
                                      Heap Blocks: exact=5300 lossy=782577
                                      Buffers: shared hit=1651569 read=2245157
                                      I/O Timings: shared/local read=29063.286
                                      Worker 0:  actual time=713.040..27006.980 rows=942051 loops=1
                                        Buffers: shared hit=317611 read=447013
                                        I/O Timings: shared/local read=5851.688
                                      Worker 1:  actual time=713.047..27065.878 rows=939696 loops=1
                                        Buffers: shared hit=317632 read=466176
                                        I/O Timings: shared/local read=6038.851
                                      Worker 2:  actual time=713.027..26894.506 rows=967468 loops=1
                                        Buffers: shared hit=349596 read=454912
                                        I/O Timings: shared/local read=5962.348
                                      Worker 3:  actual time=713.091..26826.767 rows=961928 loops=1
                                        Buffers: shared hit=332980 read=430848
                                        I/O Timings: shared/local read=5426.475
                                      ->  Bitmap Index Scan on TAB2_idx2  (cost=0.00..83684.89 rows=4702443 width=0) (actual time=688.661..688.661 rows=4781245 loops=1)
                                            Index Cond: (TAB2.MID = 'XXXXX'::numeric)
                                            Buffers: shared hit=12408
                                            Worker 2:  actual time=688.661..688.661 rows=4781245 loops=1
                                              Buffers: shared hit=12408
                                ->  Parallel Hash  (cost=7042.63..7042.63 rows=1049 width=37) (actual time=217.987..217.988 rows=27613 loops=5)
                                      Output: TAB4.PRI
                                      Buckets: 262144 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 13936kB
                                      Buffers: shared hit=134917
                                      Worker 0:  actual time=214.981..214.982 rows=27779 loops=1
                                        Buffers: shared hit=27133
                                      Worker 1:  actual time=215.455..215.456 rows=27805 loops=1
                                        Buffers: shared hit=27159
                                      Worker 2:  actual time=215.774..215.774 rows=27330 loops=1
                                        Buffers: shared hit=26706
                                      Worker 3:  actual time=215.776..215.777 rows=26880 loops=1
                                        Buffers: shared hit=26245
                                      ->  Parallel Bitmap Heap Scan on SCHEMA1.TAB4 TAB4  (cost=26.39..7042.63 rows=1049 width=37) (actual time=23.650..201.606 rows=27613 loops=5)
                                            Output: TAB4.PRI
                                            Recheck Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
                                            Rows Removed by Index Recheck: 616610
                                            Heap Blocks: exact=11978 lossy=15624
                                            Buffers: shared hit=134917
                                            Worker 0:  actual time=20.627..199.852 rows=27779 loops=1
                                              Buffers: shared hit=27133
                                            Worker 1:  actual time=21.065..199.786 rows=27805 loops=1
                                              Buffers: shared hit=27159
                                            Worker 2:  actual time=21.445..198.582 rows=27330 loops=1
                                              Buffers: shared hit=26706
                                            Worker 3:  actual time=21.470..195.915 rows=26880 loops=1
                                              Buffers: shared hit=26245
                                            ->  Bitmap Index Scan on TAB4_idx1  (cost=0.00..25.95 rows=1784 width=0) (actual time=23.938..23.938 rows=138067 loops=1)
                                                  Index Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
                                                  Buffers: shared hit=72
                          ->  Index Only Scan using TAB1_pk on SCHEMA1.TAB1 TAB1  (cost=0.57..3.51 rows=1 width=37) (actual time=0.891..0.891 rows=1 loops=5428)
                                Output: TAB1.PRI
                                Index Cond: (TAB1.PRI = (TAB2.PRI)::text)
                                Heap Fetches: 0
                                Buffers: shared hit=18262 read=5549
                                I/O Timings: shared/local read=4749.434
                                Worker 0:  actual time=0.899..0.899 rows=1 loops=1036
                                  Buffers: shared hit=3464 read=1086
                                  I/O Timings: shared/local read=913.854
                                Worker 1:  actual time=0.894..0.894 rows=1 loops=1055
                                  Buffers: shared hit=3558 read=1085
                                  I/O Timings: shared/local read=925.646
                                Worker 2:  actual time=0.901..0.901 rows=1 loops=1145
                                  Buffers: shared hit=3840 read=1169
                                  I/O Timings: shared/local read=1014.196
                                Worker 3:  actual time=0.898..0.898 rows=1 loops=1078
                                  Buffers: shared hit=3634 read=1102
                                  I/O Timings: shared/local read=950.323
                    ->  Index Only Scan using TAB3_pk on SCHEMA1.TAB3 a2  (cost=0.57..4.78 rows=1 width=36) (actual time=1.336..1.336 rows=1 loops=5428)
                          Output: a2.AID
                          Index Cond: (a2.AID = (TAB2.AID)::text)
                          Heap Fetches: 1836
                          Buffers: shared hit=18278 read=7398
                          I/O Timings: shared/local read=7172.664
                          Worker 0:  actual time=1.393..1.393 rows=1 loops=1036
                            Buffers: shared hit=3455 read=1473
                            I/O Timings: shared/local read=1429.250
                          Worker 1:  actual time=1.405..1.405 rows=1 loops=1055
                            Buffers: shared hit=3531 read=1476
                            I/O Timings: shared/local read=1464.637
                          Worker 2:  actual time=1.296..1.296 rows=1 loops=1145
                            Buffers: shared hit=3857 read=1538
                            I/O Timings: shared/local read=1465.583
                          Worker 3:  actual time=1.309..1.309 rows=1 loops=1078
                            Buffers: shared hit=3642 read=1459
                            I/O Timings: shared/local read=1395.946
Query Identifier: 7231829541130579109
Planning:
  Buffers: shared hit=1414
Planning Time: 1.305 ms
Execution Time: 45224.792 ms

Re: Query running longer

From: Laurenz Albe

On Fri, 2024-02-02 at 02:27 +0530, veem v wrote:
> We have the below query, which runs for ~45 seconds on a Postgres Aurora reader instance.
> I have captured the EXPLAIN ANALYZE output and want to understand where exactly the time
> is being spent and whether we can optimize the query further.

Aurora <> PostgreSQL, but here is what I can see:

- The index scan on SCHEMA1."TAB2" has to check 2 million extra
  rows because "work_mem" is too small.  Almost the complete time
  is spent there.

- You may be getting a bad plan, because the statistics on
  SCHEMA1.TAB4 are either out of date or not detailed enough,
  which makes PostgreSQL underestimate the result size.
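
To see whether those statistics are stale, you can check when the table was last
analyzed, for example (a sketch; adjust the case/quoting of the obfuscated names
to your real ones):

  -- sketch: when was TAB4 last (auto)analyzed?
  SELECT last_analyze, last_autoanalyze
  FROM pg_stat_all_tables
  WHERE schemaname = 'schema1' AND relname = 'tab4';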

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



Re: Query running longer

From: Ron Johnson

On Thu, Feb 1, 2024 at 4:13 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Fri, 2024-02-02 at 02:27 +0530, veem v wrote:
> > We have the below query, which runs for ~45 seconds on a Postgres Aurora reader instance.
> > I have captured the EXPLAIN ANALYZE output and want to understand where exactly the time
> > is being spent and whether we can optimize the query further.
>
> Aurora <> PostgreSQL, but here is what I can see:
>
> - The index scan on SCHEMA1."TAB2" has to check 2 million extra
>   rows because "work_mem" is too small.  Almost the complete time
>   is spent there.

Is that what these numbers show?
Buffers: shared hit=xxxxxx read=22xxxxx

Re: Query running longer

From: veem v

On Fri, 2 Feb 2024 at 02:43, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Fri, 2024-02-02 at 02:27 +0530, veem v wrote:
> > We have the below query, which runs for ~45 seconds on a Postgres Aurora reader instance.
> > I have captured the EXPLAIN ANALYZE output and want to understand where exactly the time
> > is being spent and whether we can optimize the query further.
>
> Aurora <> PostgreSQL, but here is what I can see:
>
> - The index scan on SCHEMA1."TAB2" has to check 2 million extra
>   rows because "work_mem" is too small.  Almost the complete time
>   is spent there.
>
> - You may be getting a bad plan, because the statistics on
>   SCHEMA1.TAB4 are either out of date or not detailed enough,
>   which makes PostgreSQL underestimate the result size.


Thank you so much.

I can easily read and understand execution plans in an Oracle database, and I can interpret its parameters too. But I am new to Postgres: are there good resources to quickly get up to speed on reading Postgres execution plans and on the key parameters (like the "work_mem" you mentioned)?

Regarding your first point below: how did you determine that this is the step where the majority of the resources are spent? The total query runtime was ~45 seconds; how can I find out how much of that was spent in this step? And to fix this line with regards to TAB2, should we increase "work_mem"?

- The index scan on SCHEMA1."TAB2" has to check 2 million extra
  rows because "work_mem" is too small.  Almost the complete time
  is spent there.

 ->  Parallel Bitmap Heap Scan on SCHEMA1."TAB2" TAB2  (cost=84860.50..13040301.00 rows=1175611 width=80) (actual time=713.054..26942.082 rows=956249 loops=5)
       Output: TAB2.TAB2_time, TAB2.PRI, TAB2.AID
       Recheck Cond: (TAB2.MID = 'XXXXX'::numeric)
       Rows Removed by Index Recheck: 2137395
       Filter: ((TAB2.TAB2_time >= '2024-01-01 00:00:00+00'::timestamp with time zone) AND (TAB2.TAB2_time <= '2024-01-31 00:00:00+00'::timestamp with time zone))
       Heap Blocks: exact=5300 lossy=782577
       Buffers: shared hit=1651569 read=2245157
       I/O Timings: shared/local read=29063.286

Another point you mentioned, below: for this, should we run VACUUM ANALYZE on the table TAB4?

- You may be getting a bad plan, because the statistics on
  SCHEMA1.TAB4 are either out of date or not detailed enough,
  which makes PostgreSQL underestimate the result size.

  
 ->  Parallel Hash  (cost=7042.63..7042.63 rows=1049 width=37) (actual time=217.987..217.988 rows=27613 loops=5)
       Output: TAB4.PRI
       Buckets: 262144 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 13936kB
       Buffers: shared hit=134917
       Worker 0:  actual time=214.981..214.982 rows=27779 loops=1
         Buffers: shared hit=27133
       Worker 1:  actual time=215.455..215.456 rows=27805 loops=1
         Buffers: shared hit=27159
       Worker 2:  actual time=215.774..215.774 rows=27330 loops=1
         Buffers: shared hit=26706
       Worker 3:  actual time=215.776..215.777 rows=26880 loops=1
         Buffers: shared hit=26245
       ->  Parallel Bitmap Heap Scan on SCHEMA1.TAB4 TAB4  (cost=26.39..7042.63 rows=1049 width=37) (actual time=23.650..201.606 rows=27613 loops=5)
             Output: TAB4.PRI
             Recheck Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
             Rows Removed by Index Recheck: 616610
             Heap Blocks: exact=11978 lossy=15624
             Buffers: shared hit=134917
             Worker 0:  actual time=20.627..199.852 rows=27779 loops=1
               Buffers: shared hit=27133
             Worker 1:  actual time=21.065..199.786 rows=27805 loops=1
               Buffers: shared hit=27159
             Worker 2:  actual time=21.445..198.582 rows=27330 loops=1
               Buffers: shared hit=26706
             Worker 3:  actual time=21.470..195.915 rows=26880 loops=1
               Buffers: shared hit=26245
             ->  Bitmap Index Scan on TAB4_idx1  (cost=0.00..25.95 rows=1784 width=0) (actual time=23.938..23.938 rows=138067 loops=1)
                   Index Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
                   Buffers: shared hit=72




Re: Query running longer

From: Laurenz Albe

On Fri, 2024-02-02 at 10:14 +0530, veem v wrote:
> On Fri, 2 Feb 2024 at 02:43, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Fri, 2024-02-02 at 02:27 +0530, veem v wrote:
> > > We have the below query, which runs for ~45 seconds on a Postgres Aurora reader instance.
> > > I have captured the EXPLAIN ANALYZE output and want to understand where exactly the time
> > > is being spent and whether we can optimize the query further.
> >
> > Aurora <> PostgreSQL, but here is what I can see:
> >
> > - The index scan on SCHEMA1."TAB2" has to check 2 million extra
> >   rows because "work_mem" is too small.  Almost the complete time
> >   is spent there.
> >
> > - You may be getting a bad plan, because the statistics on
> >   SCHEMA1.TAB4 are either out of date or not detailed enough,
> >   which makes PostgreSQL underestimate the result size.
>
> Regarding your first point below: how did you determine that this is the step where the
> majority of the resources are spent? The total query runtime was ~45 seconds; how can I
> find out how much of that was spent in this step? And to fix this line with regards to
> TAB2, should we increase "work_mem"?
>
> - The index scan on SCHEMA1."TAB2" has to check 2 million extra
>   rows because "work_mem" is too small.  Almost the complete time
>   is spent there.
>
>  ->  Parallel Bitmap Heap Scan on SCHEMA1."TAB2" TAB2  (cost=84860.50..13040301.00 rows=1175611 width=80) (actual time=713.054..26942.082 rows=956249 loops=5)
>        Output: TAB2.TAB2_time, TAB2.PRI, TAB2.AID
>        Recheck Cond: (TAB2.MID = 'XXXXX'::numeric)
>        Rows Removed by Index Recheck: 2137395
>        Filter: ((TAB2.TAB2_time >= '2024-01-01 00:00:00+00'::timestamp with time zone) AND (TAB2.TAB2_time <= '2024-01-31 00:00:00+00'::timestamp with time zone))
>        Heap Blocks: exact=5300 lossy=782577
>        Buffers: shared hit=1651569 read=2245157
>        I/O Timings: shared/local read=29063.286

The bitmap index scan builds a bitmap, the size of which is limited by "work_mem".
If that is big enough, the bitmap contains a bit for each table row.  If not,
it contains a bit per table row only for some table blocks (the "exact" ones)
and degrades to a bit per block for the rest (the "lossy" ones).

For lossy blocks, *all* rows in the block have to be rechecked, which is
overhead.  Given that only a small part of the time (26942.082 - 713.054) * 5
is spent doing I/O (29063.286), I guess that the rest is spent processing
table rows.
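
In concrete numbers: the bitmap heap scan averaged 26942.082 - 713.054 ≈ 26229 ms
per process, and with loops=5 that is about 131 seconds of work across the five
processes, of which only about 29 seconds (29063.286 ms) were read I/O.  The
remaining ~100 seconds are CPU time, largely spent rechecking the rows in the
782577 lossy heap blocks.

If you want to test this, you can raise "work_mem" for your session only and
re-run the EXPLAIN (the value here is just an illustration; size it to the
memory you can spare):

  SET work_mem = '256MB';  -- illustrative value; affects only the current session
  -- then re-run EXPLAIN (ANALYZE, BUFFERS); "Heap Blocks" should show
  -- few or no "lossy" blocks once the bitmap fits in memory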

> Another point you mentioned, below: for this, should we run VACUUM ANALYZE on the table TAB4?
>
> - You may be getting a bad plan, because the statistics on
>   SCHEMA1.TAB4 are either out of date or not detailed enough,
>   which makes PostgreSQL underestimate the result size.
>   
>   ->  Parallel Bitmap Heap Scan on SCHEMA1.TAB4 TAB4  (cost=26.39..7042.63 rows=1049 width=37) (actual time=23.650..201.606 rows=27613 loops=5)
>        ->  Bitmap Index Scan on TAB4_idx1  (cost=0.00..25.95 rows=1784 width=0) (actual time=23.938..23.938 rows=138067 loops=1)
>            Index Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
>            Buffers: shared hit=72

An ANALYZE might be enough.

If not, you can try to collect more detailed statistics for the column:

  ALTER TABLE SCHEMA1.TAB4 ALTER TAB4_code SET STATISTICS 1000;
  ANALYZE SCHEMA1.TAB4;
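
Afterwards you can verify what the planner sees for that column, for example
(a sketch, assuming the names are stored in lower case):

  -- sketch: inspect the collected column statistics for TAB4_code
  SELECT attname, n_distinct, most_common_freqs
  FROM pg_stats
  WHERE schemaname = 'schema1' AND tablename = 'tab4' AND attname = 'tab4_code';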

Yours,
Laurenz Albe