Re: How to improve the performance of my SQL query? - Mailing list pgsql-general
From | gzh |
---|---|
Subject | Re: How to improve the performance of my SQL query? |
Date | |
Msg-id | 77fe4763.2f4d.1898b32fdd2.Coremail.gzhcoder@126.com Whole thread Raw |
In response to | Re: How to improve the performance of my SQL query? (jian he <jian.universality@gmail.com>) |
List | pgsql-general |
Thank you for your reply.
>I think the whole query can just: >select COUNT(ET_CD) >from TBL_SHA >WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001';
This is related to the business logic.
>if many duplicates rows returned, then there is no point of evaluate >something like {1 in (1,1,1,1,1,1,)}
Because the primary key of the 'tbl_inf' table only consists of 'ms_cd' and 'ry_cd' columns, the subquery will not return duplicate rows.
At 2023-07-24 22:42:01, "jian he" <jian.universality@gmail.com> wrote: >On Mon, Jul 24, 2023 at 5:54 PM gzh <gzhcoder@126.com> wrote: >> >> >Did you change any parameters that have an impact on query planning? >> >> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS). >> >> I added some parameters and re-executed the Execution Plan. >> >> Except for the index not taking effect, I still don't know the reason why the index is not working. >> >> Is it because there is too much data that meets the conditions? >> >> >> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON) >> >> select COUNT(ET_CD) >> >> from TBL_SHA >> >> WHERE TBL_SHA.MS_CD = 'MLD009' >> >> and TBL_SHA.ETRYS in >> >> (select TBL_INF.RY_CD >> >> from TBL_INF >> >> WHERE TBL_INF.MS_CD = 'MLD009' >> >> AND TBL_INF.RY_CD = '00000001' >> >> ) >> >> >> ----- Execution Plan ----- >> >> Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1) >> >> Output: (count(tbl_sha.et_cd)) >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1) >> >> Output: count(tbl_sha.et_cd) >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1) >> >> Output: tbl_sha.et_cd >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1) >> >> Output: tbl_inf.ms_cd, tbl_inf.ry_cd >> >> Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar)) >> >> Heap Fetches: 1 >> >> Buffers: shared hit=4 >> >> -> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1) >> >> Output: tbl_sha.et_cd, tbl_sha.etrys >> >> Workers Planned: 2 >> >> Workers Launched: 2 >> >> Buffers: shared hit=58944 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3) >> >> Output: tbl_sha.et_cd, tbl_sha.etrys >> >> Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar)) >> >> Rows Removed by Filter: 14062278 >> >> Buffers: shared hit=58944 read=2112758 >> >> I/O Timings: read=357249.120 >> >> Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1 >> >> Buffers: shared hit=17875 read=706862 >> >> I/O Timings: read=119193.744 >> >> Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1 >> >> Buffers: shared hit=19813 read=706359 >> >> I/O Timings: read=119386.899 >> >> Planning: >> >> Buffers: shared hit=42 >> >> Planning Time: 1.024 ms >> >> Execution Time: 128717.731 ms >> >> > >I think the whole query can just: >select COUNT(ET_CD) >from TBL_SHA >WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '00000001'; > >> and TBL_SHA.ETRYS in >> (select TBL_INF.RY_CD >> from TBL_INF >> WHERE TBL_INF.MS_CD = 'MLD009' >> AND TBL_INF.RY_CD = '00000001' >> ) > >if subquery after IN clause part, no rows returned then the whole >query would return zero row. >if many duplicates rows returned, then there is no point of evaluate >something like {1 in (1,1,1,1,1,1,)}
pgsql-general by date: