Re: How to improve the performance of my SQL query? - Mailing list pgsql-general
From | gzh |
---|---|
Subject | Re: How to improve the performance of my SQL query? |
Date | |
Msg-id | 35cecf64.4a72.1899129b4b7.Coremail.gzhcoder@126.com Whole thread Raw |
In response to | Re: How to improve the performance of my SQL query? (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: How to improve the performance of my SQL query?
Re: How to improve the performance of my SQL query? |
List | pgsql-general |
Thank you very much for taking the time to reply to my question.>You might want to check your description of the table definitions.>Going by the above EXPLAIN ANALYZE output, it very much does not look >like ms_cd is the primary key of TBL_SHA. If it is then it's very >weird that you have 3200000 rows for MS_CD = 'MLD009'. You have some >data corruption if that's the case. I suspect you've just not>accurately described the table definition, however.The primary key of the SHA table has six fields, and ms_cd is just one of them.I'm sorry, I didn't make that clear.
>Try executing the query after having done: > >SET enable_seqscan TO off; > >What plan does it use now? >>Is that plan faster or slower than the seq scan plan?There's improvement, but it's still quite slow. QUERY PLAN (enable_seqscan=on)Limit (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.822..167183.058 rows=1 loops=1) -> Aggregate (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.820..167183.056 rows=1 loops=1) -> Nested Loop (cost=1000.29..2688558.85 rows=1983209 width=9) (actual time=43544.753..166906.304 rows=2413500 loops=1) -> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=1.034..1.038 rows=1 loops=1) Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '00000001'::bpchar)) Heap Fetches: 1 -> Gather (cost=1000.00..2668718.45 rows=1983209 width=18) (actual time=43543.714..166447.333 rows=2413500 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on tbl_sha (cost=0.00..2469397.55 rows=826337 width=18) (actual time=43537.056..166225.162 rows=804500 loops=3) Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '00000001'::bpchar)) Rows Removed by Filter: 15362328 Planning Time: 2.942 ms Execution Time: 167183.133 ms SET enable_seqscan TO off; QUERY PLAN (enable_seqscan=off) Limit (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.419..22320.102 rows=1 loops=1) -> Aggregate (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.418..22320.100 rows=1 loops=1) -> Nested Loop (cost=93112.74..2876169.16 rows=1921561 width=9) (actual time=265.880..22000.432 rows=2413500 loops=1) -> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.013..0.020 rows=1 loops=1) Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '00000001'::bpchar)) Heap Fetches: 1 -> Gather (cost=93112.45..2856945.24 rows=1921561 width=18) (actual time=265.864..21535.325 rows=2413500 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Bitmap Heap Scan on tbl_sha (cost=92112.45..2663789.14 rows=800650 width=18) (actual time=260.540..21442.169 rows=804500 loops=3) Recheck Cond: (ms_cd = 'MLD009'::bpchar) Rows Removed by Index Recheck: 49 Filter: (etrys = '00000001'::bpchar) Rows Removed by Filter: 295500 Heap Blocks: exact=13788 lossy=10565 -> Bitmap Index Scan on index_search_04_mscd_cdate (cost=0.00..91632.06 rows=3402599 width=0) (actual time=249.718..249.718 rows=3300000 loops=1) Index Cond: (ms_cd = 'MLD009'::bpchar) Planning Time: 0.670 ms Execution Time: 22320.153 ms
At 2023-07-25 21:04:16, "David Rowley" <dgrowleyml@gmail.com> wrote: >On Fri, 21 Jul 2023 at 13:44, gzh <gzhcoder@126.com> wrote: >> >> The definitions of the columns used in SQL are as follows. >> >> TBL_SHA >> ms_cd character(6) NOT NULL -- PRIMARY KEY >> et_cd character(8) >> etrys character(8) > >> explain analyze >> select COUNT(ET_CD) >> from TBL_SHA >> WHERE TBL_SHA.MS_CD = 'MLD009' >> and TBL_SHA.ETRYS in >> (select TBL_INF.RY_CD >> from TBL_INF >> WHERE TBL_INF.MS_CD = 'MLD009' >> AND TBL_INF.RY_CD = '00000001' >> ) >> ----- Execution Plan ----- >> Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1) >> -> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1) >> -> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1) >> -> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1) >> Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar)) >> Heap Fetches: 1 >> -> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1) >> Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar)) >> Rows Removed by Filter: 32000325 >> Planning Time: 0.162 ms >> Execution Time: 124168.838 ms > >You might want to check your description of the table definitions. >Going by the above EXPLAIN ANALYZE output, it very much does not look >like ms_cd is the primary key of TBL_SHA. If it is then it's very >weird that you have 3200000 rows for MS_CD = 'MLD009'. You have some >data corruption if that's the case. I suspect you've just not >accurately described the table definition, however. > >David
pgsql-general by date: