Home > mailing lists

Re: How to improve the performance of my SQL query？ - Mailing list pgsql-general

From	gzh
Subject	Re: How to improve the performance of my SQL query？
Date	July 24, 2023 09:54:19
Msg-id	66883f7d.6d50.189875236b2.Coremail.gzhcoder@126.com Whole thread Raw
In response to	Re: How to improve the performance of my SQL query？ (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses	Re: How to improve the performance of my SQL query？ Re: How to improve the performance of my SQL query？
List	pgsql-general

Tree view

>Did you change any parameters that have an impact on query planning?

>You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).

I added some parameters and re-executed the Execution Plan.

Except for the index not taking effect, I still don't know the reason why the index is not working.

Is it because there is too much data that meets the conditions?

EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)

select COUNT(ET_CD)

from TBL_SHA

WHERE TBL_SHA.MS_CD = 'MLD009'

and TBL_SHA.ETRYS in

(select TBL_INF.RY_CD

from TBL_INF

WHERE TBL_INF.MS_CD = 'MLD009'

AND TBL_INF.RY_CD = '00000001'

)

----- Execution Plan -----

Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)

Output: (count(tbl_sha.et_cd))

Buffers: shared hit=58948 read=2112758

I/O Timings: read=357249.120

-> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)

Output: count(tbl_sha.et_cd)

Buffers: shared hit=58948 read=2112758

I/O Timings: read=357249.120

-> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)

Output: tbl_sha.et_cd

Buffers: shared hit=58948 read=2112758

I/O Timings: read=357249.120

-> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)

Output: tbl_inf.ms_cd, tbl_inf.ry_cd

Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))

Heap Fetches: 1

Buffers: shared hit=4

-> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)

Output: tbl_sha.et_cd, tbl_sha.etrys

Workers Planned: 2

Workers Launched: 2

Buffers: shared hit=58944 read=2112758

I/O Timings: read=357249.120

-> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)

Output: tbl_sha.et_cd, tbl_sha.etrys

Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))

Rows Removed by Filter: 14062278

Buffers: shared hit=58944 read=2112758

I/O Timings: read=357249.120

Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1

Buffers: shared hit=17875 read=706862

I/O Timings: read=119193.744

Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1

Buffers: shared hit=19813 read=706359

I/O Timings: read=119386.899

Planning:

Buffers: shared hit=42

Planning Time: 1.024 ms

Execution Time: 128717.731 ms


At 2023-07-24 13:43:46, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:
>On Fri, 2023-07-21 at 09:43 +0800, gzh wrote:
>> The definitions of the columns used in SQL are as follows.
>> 
>> TBL_SHA
>> 
>> ms_cd character(6) NOT NULL       -- PRIMARY KEY
>> et_cd character(8)
>> etrys character(8)
>> 
>> TBL_INF
>> 
>> ms_cd character(6) NOT NULL       -- PRIMARY KEY
>> ry_cd character(8) NOT NULL       -- PRIMARY KEY
>> 
>> I made some modifications to the data, and I realized that I should not change the length of the data. 
>> The actual data and its corresponding execution plan are shown below.
>> 
>> explain analyze
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE TBL_SHA.MS_CD = 'MLD009'
>> and TBL_SHA.ETRYS in
>>    (select TBL_INF.RY_CD
>>     from TBL_INF
>>     WHERE TBL_INF.MS_CD = 'MLD009'
>>    AND TBL_INF.RY_CD = '00000001'
>>    )
>> ----- Execution Plan -----
>> Limit  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
>>   ->  Aggregate  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
>>         ->  Nested Loop  (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
>>               ->  Index Only Scan using TBL_INF_pkc on TBL_INF  (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
>>                     Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
>>                     Heap Fetches: 1
>>               ->  Seq Scan on TBL_SHA  (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
>>                     Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
>>                     Rows Removed by Filter: 32000325
>> Planning Time: 0.162 ms
>> Execution Time: 124168.838 ms
>
>Thanks.  That should definitely use a b-tree index defined on (ms_cd, etrsys).
>
>Did you change any parameters that have an impact on query planning?
>You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>
>Yours,
>Laurenz Albe

pgsql-general by date:

From: Antonio Čale
Date: 24 July 2023, 09:51:23
Subject: Re: \d don't print all the tables

From: Dominique Devienne
Date: 24 July 2023, 12:44:24
Subject: Re: aclitem binary encoding

Re: How to improve the performance of my SQL query？ - Mailing list pgsql-general

Previous

Next