Re: How to improve the performance of my SQL query? - Mailing list pgsql-general

From jian he
Subject Re: How to improve the performance of my SQL query?
Date
Msg-id CACJufxEDxWZO9HbZ8Vqj-BncvqX3y3SqLr+0b23ekj=GkTpAUA@mail.gmail.com
Whole thread Raw
In response to Re: How to improve the performance of my SQL query?  (gzh <gzhcoder@126.com>)
Responses Re: How to improve the performance of my SQL query?
List pgsql-general
On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder@126.com> wrote:
>
> >Did you change any parameters that have an impact on query planning?
>
> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>
> I added some parameters and re-executed the Execution Plan.
>
> Except for the index not taking effect, I still don't know the reason why the index is not working.
>
> Is it because there is too much data that meets the conditions?
>
>
> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>
> select COUNT(ET_CD)
>
> from TBL_SHA
>
> WHERE TBL_SHA.MS_CD = 'MLD009'
>
> and TBL_SHA.ETRYS in
>
>    (select TBL_INF.RY_CD
>
>     from TBL_INF
>
>     WHERE TBL_INF.MS_CD = 'MLD009'
>
>    AND TBL_INF.RY_CD = '00000001'
>
>    )
>
>
> ----- Execution Plan -----
>
> Limit  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
>
>   Output: (count(tbl_sha.et_cd))
>
>   Buffers: shared hit=58948 read=2112758
>
>   I/O Timings: read=357249.120
>
>   ->  Aggregate  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
>
>         Output: count(tbl_sha.et_cd)
>
>         Buffers: shared hit=58948 read=2112758
>
>         I/O Timings: read=357249.120
>
>         ->  Nested Loop  (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500
loops=1)
>
>               Output: tbl_sha.et_cd
>
>               Buffers: shared hit=58948 read=2112758
>
>               I/O Timings: read=357249.120
>
>               ->  Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf  (cost=0.29..8.31 rows=1 width=9) (actual
time=0.046..0.051rows=1 loops=1) 
>
>                     Output: tbl_inf.ms_cd, tbl_inf.ry_cd
>
>                     Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
>
>                     Heap Fetches: 1
>
>                     Buffers: shared hit=4
>
>               ->  Gather  (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087
rows=2613500loops=1) 
>
>                     Output: tbl_sha.et_cd, tbl_sha.etrys
>
>                     Workers Planned: 2
>
>                     Workers Launched: 2
>
>                     Buffers: shared hit=58944 read=2112758
>
>                     I/O Timings: read=357249.120
>
>                     ->  Parallel Seq Scan on mtpdb.tbl_sha  (cost=0.00..2454183.88 rows=1012745 width=18) (actual
time=952.728..127583.089rows=871167 loops=3) 
>
>                           Output: tbl_sha.et_cd, tbl_sha.etrys
>
>                           Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
>
>                           Rows Removed by Filter: 14062278
>
>                           Buffers: shared hit=58944 read=2112758
>
>                           I/O Timings: read=357249.120
>
>                           Worker 0:  actual time=1432.292..127762.181 rows=988036 loops=1
>
>                             Buffers: shared hit=17875 read=706862
>
>                             I/O Timings: read=119193.744
>
>                           Worker 1:  actual time=1425.878..127786.777 rows=992381 loops=1
>
>                             Buffers: shared hit=19813 read=706359
>
>                             I/O Timings: read=119386.899
>
> Planning:
>
>   Buffers: shared hit=42
>
> Planning Time: 1.024 ms
>
> Execution Time: 128717.731 ms
>
>

I think the whole query can just:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';

> and TBL_SHA.ETRYS in
>    (select TBL_INF.RY_CD
>     from TBL_INF
>     WHERE TBL_INF.MS_CD = 'MLD009'
>    AND TBL_INF.RY_CD = '00000001'
>    )

if subquery after IN clause part, no rows returned then the whole
query would return zero row.
if many duplicates rows returned, then there is no point of evaluate
something like {1 in (1,1,1,1,1,1,)}



pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: \d don't print all the tables
Next
From: Ron
Date:
Subject: Re: Grant all privileges to user on a database