Home > mailing lists

Re: How to improve the performance of my SQL query？ - Mailing list pgsql-general

From	gzh
Subject	Re: How to improve the performance of my SQL query？
Date	July 25, 2023 03:58:42
Msg-id	77fe4763.2f4d.1898b32fdd2.Coremail.gzhcoder@126.com Whole thread Raw
In response to	Re: How to improve the performance of my SQL query？ (jian he <jian.universality@gmail.com>)
List	pgsql-general

Tree view

Thank you for your reply.

>I think the whole query can just:
>select COUNT(ET_CD)
>from TBL_SHA
>WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';

This is related to the business logic.

>if many duplicates rows returned, then there is no point of evaluate
>something like {1 in (1,1,1,1,1,1,)}

Because the primary key of the 'tbl_inf' table only consists of 'ms_cd' and 'ry_cd' columns, the subquery will not return duplicate rows.


At 2023-07-24 22:42:01, "jian he" <jian.universality@gmail.com> wrote:
>On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder@126.com> wrote:
>>
>> >Did you change any parameters that have an impact on query planning?
>>
>> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>>
>> I added some parameters and re-executed the Execution Plan.
>>
>> Except for the index not taking effect, I still don't know the reason why the index is not working.
>>
>> Is it because there is too much data that meets the conditions?
>>
>>
>> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>>
>> select COUNT(ET_CD)
>>
>> from TBL_SHA
>>
>> WHERE TBL_SHA.MS_CD = 'MLD009'
>>
>> and TBL_SHA.ETRYS in
>>
>>    (select TBL_INF.RY_CD
>>
>>     from TBL_INF
>>
>>     WHERE TBL_INF.MS_CD = 'MLD009'
>>
>>    AND TBL_INF.RY_CD = '00000001'
>>
>>    )
>>
>>
>> ----- Execution Plan -----
>>
>> Limit  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
>>
>>   Output: (count(tbl_sha.et_cd))
>>
>>   Buffers: shared hit=58948 read=2112758
>>
>>   I/O Timings: read=357249.120
>>
>>   ->  Aggregate  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
>>
>>         Output: count(tbl_sha.et_cd)
>>
>>         Buffers: shared hit=58948 read=2112758
>>
>>         I/O Timings: read=357249.120
>>
>>         ->  Nested Loop  (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
>>
>>               Output: tbl_sha.et_cd
>>
>>               Buffers: shared hit=58948 read=2112758
>>
>>               I/O Timings: read=357249.120
>>
>>               ->  Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf  (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
>>
>>                     Output: tbl_inf.ms_cd, tbl_inf.ry_cd
>>
>>                     Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
>>
>>                     Heap Fetches: 1
>>
>>                     Buffers: shared hit=4
>>
>>               ->  Gather  (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
>>
>>                     Output: tbl_sha.et_cd, tbl_sha.etrys
>>
>>                     Workers Planned: 2
>>
>>                     Workers Launched: 2
>>
>>                     Buffers: shared hit=58944 read=2112758
>>
>>                     I/O Timings: read=357249.120
>>
>>                     ->  Parallel Seq Scan on mtpdb.tbl_sha  (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
>>
>>                           Output: tbl_sha.et_cd, tbl_sha.etrys
>>
>>                           Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
>>
>>                           Rows Removed by Filter: 14062278
>>
>>                           Buffers: shared hit=58944 read=2112758
>>
>>                           I/O Timings: read=357249.120
>>
>>                           Worker 0:  actual time=1432.292..127762.181 rows=988036 loops=1
>>
>>                             Buffers: shared hit=17875 read=706862
>>
>>                             I/O Timings: read=119193.744
>>
>>                           Worker 1:  actual time=1425.878..127786.777 rows=992381 loops=1
>>
>>                             Buffers: shared hit=19813 read=706359
>>
>>                             I/O Timings: read=119386.899
>>
>> Planning:
>>
>>   Buffers: shared hit=42
>>
>> Planning Time: 1.024 ms
>>
>> Execution Time: 128717.731 ms
>>
>>
>
>I think the whole query can just:
>select COUNT(ET_CD)
>from TBL_SHA
>WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
>
>> and TBL_SHA.ETRYS in
>>    (select TBL_INF.RY_CD
>>     from TBL_INF
>>     WHERE TBL_INF.MS_CD = 'MLD009'
>>    AND TBL_INF.RY_CD = '00000001'
>>    )
>
>if subquery after IN clause part, no rows returned then the whole
>query would return zero row.
>if many duplicates rows returned, then there is no point of evaluate
>something like {1 in (1,1,1,1,1,1,)}

pgsql-general by date:

From: gzh
Date: 25 July 2023, 03:11:13
Subject: Re: How to improve the performance of my SQL query？

From: Laurenz Albe
Date: 25 July 2023, 05:56:25
Subject: Re: How to improve the performance of my SQL query？

Re: How to improve the performance of my SQL query？ - Mailing list pgsql-general

Previous

Next