Thread: Is there any good optimization solution to improve the query efficiency?

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!

On Mon, 5 Jun 2023 at 07:56, gzh <gzhcoder@126.com> wrote:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!


Welcome to query optimisation, I hope you'll enjoy working on this problem, it's one of the things I and many others love.

From a quick glance, it looks like there are opportunities for index use there but the plan doesn't contain any. Do any indexes exist?

You could also start looking at it in parts, e.g. separate out the inner T_POV2 subselect. It looks like that contains a cross join, which is then whittled down with WHERE clauses. You could try adding ON to the FROM instead. https://www.postgresql.org/docs/15/sql-select.html#SQL-FROM

Oliver
Indices on TBL_RES.CID, TBL_RES.COD and the "join columns"?
Have you vacuumed and analyzed the tables lately?
Oliver's comment about first optimizing the individual subselects is also SOP.


On 6/5/23 01:56, gzh wrote:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!


--
Born in Arizona, moved to Babylonia.
On Mon, 5 Jun 2023 at 18:56, gzh <gzhcoder@126.com> wrote:
> I'm running into some performance issues with my SQL query.
> The following SQL query is taking a long time to execute.

>         ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111
loops=1)
>               Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND
(T_CUST.gstseq= (min(T_CUST_1.gstseq))))
 
>               ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual
time=0.015..561.005rows=2000752 loops=1)
 
>               ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111
loops=1)

The above join's selectivity estimation seems to be causing an upper
join to resort to performing a Nested Loop join because the planner
thinks the join will only produce 1 row.

Unfortunately, extended statistics only help for base relation
estimations and do nothing for join estimations, so your best bet
might be to just:

SET enable_nestloop TO off;

for this query.

David



Hi, David


>The above join's selectivity estimation seems to be causing an upper

>join to resort to performing a Nested Loop join because the planner

>thinks the join will only produce 1 row.

>

>Unfortunately, extended statistics only help for base relation

>estimations and do nothing for join estimations, so your best bet

>might be to just:

>

>SET enable_nestloop TO off;

>

>for this query.

After making the adjustments as you suggested, 

the SQL statement that previously took 16 minutes to query results can now be queried in less than 10 seconds. 


Thank you very much for taking the time to reply to my question and providing a solution that solved my issue. 

Your expertise and willingness to help are greatly appreciated, and I learned a lot from your answer. 

Thanks again!







At 2023-06-05 16:21:19, "David Rowley" <dgrowleyml@gmail.com> wrote: >On Mon, 5 Jun 2023 at 18:56, gzh <gzhcoder@126.com> wrote: >> I'm running into some performance issues with my SQL query. >> The following SQL query is taking a long time to execute. > >> -> Hash Join (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1) >> Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq)))) >> -> Seq Scan on TBL_CUST T_CUST (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1) >> -> Hash (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1) > >The above join's selectivity estimation seems to be causing an upper >join to resort to performing a Nested Loop join because the planner >thinks the join will only produce 1 row. > >Unfortunately, extended statistics only help for base relation >estimations and do nothing for join estimations, so your best bet >might be to just: > >SET enable_nestloop TO off; > >for this query. > >David

Re: Is there any good optimization solution to improve the query efficiency?

From
Lorusso Domenico
Date:
Hello,
In many case a formal writing and usage of with statement could solve the issue.
If you need join, use always join:
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ)
this is an inner join.

I mean something like this
with t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)

select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contain lessere record a good idea is start from this table and use in join with other


Il giorno lun 5 giu 2023 alle ore 08:57 gzh <gzhcoder@126.com> ha scritto:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!



--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]

Thank you very much for taking the time to reply to my question. 

I followed your suggestion and rewrote the SQL using Common Table Expression (CTE). 

Unfortunately, there was no significant improvement in performance.



At 2023-06-05 17:47:25, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

Hello,
In many case a formal writing and usage of with statement could solve the issue.
If you need join, use always join:
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ)
this is an inner join.

I mean something like this
with t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)

select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contain lessere record a good idea is start from this table and use in join with other


Il giorno lun 5 giu 2023 alle ore 08:57 gzh <gzhcoder@126.com> ha scritto:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!



--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]

Re: Is there any good optimization solution to improve the query efficiency?

From
Lorusso Domenico
Date:
try this (there is some comment)

with t_res as (
select RSNO,  KNO
from TBL_RES
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
), t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from T_RES -- this is tbl_res already filter by date
inner join TBL_CUST T_CUST on T_RES.RSNO = T_CUST.RSNO
inner join TBL_POV T_POV on T_POV.CRSNO = T_RES.CRSNO -- why you use this table? it doesn't seem to be used to extract data. Are you trying to extract data from T_RES that have at least a record in T_POV? in this case could work better move this join in the first with (using distinct or group by to ensure there will be just a record for rsno and kno)
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO       

Il giorno lun 5 giu 2023 alle ore 12:06 gzh <gzhcoder@126.com> ha scritto:

Thank you very much for taking the time to reply to my question. 

I followed your suggestion and rewrote the SQL using Common Table Expression (CTE). 

Unfortunately, there was no significant improvement in performance.



At 2023-06-05 17:47:25, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

Hello,
In many case a formal writing and usage of with statement could solve the issue.
If you need join, use always join:
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ)
this is an inner join.

I mean something like this
with t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)

select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contain lessere record a good idea is start from this table and use in join with other


Il giorno lun 5 giu 2023 alle ore 08:57 gzh <gzhcoder@126.com> ha scritto:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!



--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]

I made some slight changes to the SQL you provided, but the optimization approach remained the same. 

I was surprised that the results were retrieved in less than one second. It's really impressive!

Below is the execution plan. Thank you very much for providing the optimization method, I learned a lot from it.


explain analyse

with t_res as

(select RSNO, KNO, CRSNO

    from tbl_res

   where tbl_res.CID >= to_date('2022/07/01', 'YYYY/MM/DD')

     and tbl_res.CID <= to_date('2022/07/31', 'YYYY/MM/DD')

     and tbl_res.COD >= to_date('2022/07/01', 'YYYY/MM/DD')

     and tbl_res.COD <= to_date('2022/07/31', 'YYYY/MM/DD')),

t_pov2 as

(select t_cust.RSNO, t_cust.KNO, MIN(t_cust.GSTSEQ) GSTSEQ

    from t_res -- this is tbl_res already filter by date

   inner join tbl_cust t_cust

      on t_res.RSNO = t_cust.RSNO

   inner join tbl_pov t_pov

      on t_pov.CRSNO = t_res.CRSNO -- why you use this table? it doesn't seem to be used to extract data. Are you trying to extract data from t_res that have at least a record in t_pov? in this case could work better move this join in the first with (using distinct or group by to ensure there will be just a record for RSNO and KNO)

   where t_cust.STSFLG = 'T'

     and t_cust.DISPSEQ <> 9999

     AND t_cust.KFIX = '0'

   group by t_cust.RSNO, t_cust.KNO),

t_pov3 as

(select t_cust.RSNO RSNO2, t_cust.KNO, t_cust.AGE, t_cust.GST

    from tbl_cust t_cust

   inner join t_pov2

      on t_pov2.RSNO = t_cust.RSNO

     and t_pov2.KNO = t_cust.KNO

     and t_pov2.GSTSEQ = t_cust.GSTSEQ)

select *

  from t_res

  left outer join t_pov3

    on t_res.RSNO = t_pov3.RSNO2

   and t_res.KNO = t_pov3.KNO


----- execution plan -----

Hash Right Join  (cost=125923.21..132076.05 rows=472 width=164) (actual time=408.252..410.342 rows=15123 loops=1)

  Hash Cond: ((t_cust.RSNO = t_res.RSNO) AND ((t_cust.KNO)::text = (t_res.KNO)::text))

  CTE t_res

    ->  Gather  (cost=1000.00..58410.51 rows=472 width=27) (actual time=55.587..207.684 rows=15123 loops=1)

          Workers Planned: 2

          Workers Launched: 2

          ->  Parallel Seq Scan on tbl_res  (cost=0.00..57363.31 rows=197 width=27) (actual time=49.850..204.235 rows=5041 loops=3)

                Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))

                Rows Removed by Filter: 161714

  ->  Nested Loop  (cost=67496.18..73648.88 rows=1 width=56) (actual time=191.880..191.924 rows=11 loops=1)

        ->  GroupAggregate  (cost=67495.75..67510.49 rows=737 width=50) (actual time=191.869..191.878 rows=11 loops=1)

              Group Key: t_cust_1.RSNO, t_cust_1.KNO

              ->  Sort  (cost=67495.75..67497.59 rows=737 width=23) (actual time=191.859..191.862 rows=13 loops=1)

                    Sort Key: t_cust_1.RSNO, t_cust_1.KNO

                    Sort Method: quicksort  Memory: 26kB

                    ->  Nested Loop  (cost=57118.88..67460.65 rows=737 width=23) (actual time=172.185..191.837 rows=13 loops=1)

                          ->  Hash Join  (cost=57118.45..58758.38 rows=472 width=14) (actual time=172.154..191.647 rows=13 loops=1)

                                Hash Cond: ((t_res_1.crsno)::text = (t_pov.crsno)::text)

                                ->  CTE Scan on t_res t_res_1  (cost=0.00..9.44 rows=472 width=72) (actual time=0.003..1.445 rows=15123 loops=1)

                                ->  Hash  (cost=51380.09..51380.09 rows=330109 width=9) (actual time=170.350..170.350 rows=330109 loops=1)

                                      Buckets: 131072  Batches: 8  Memory Usage: 2707kB

                                      ->  Seq Scan on tbl_pov t_pov  (cost=0.00..51380.09 rows=330109 width=9) (actual time=0.029..124.632 rows=330109 loops=1)

                          ->  Index Scan using tbl_cust_pk on tbl_cust t_cust_1  (cost=0.43..18.42 rows=2 width=23) (actual time=0.011..0.012 rows=1 loops=13)

                                Index Cond: (RSNO = t_res_1.RSNO)

                                Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))

                                Rows Removed by Filter: 2

        ->  Index Scan using tbl_cust_pk on tbl_cust t_cust  (cost=0.43..8.31 rows=1 width=61) (actual time=0.003..0.003 rows=1 loops=11)

              Index Cond: ((RSNO = t_cust_1.RSNO) AND (gstseq = (min(t_cust_1.gstseq))))

              Filter: ((t_cust_1.KNO)::text = (KNO)::text)

  ->  Hash  (cost=9.44..9.44 rows=472 width=108) (actual time=216.361..216.361 rows=15123 loops=1)

        Buckets: 16384 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 882kB

        ->  CTE Scan on t_res  (cost=0.00..9.44 rows=472 width=108) (actual time=55.591..211.698 rows=15123 loops=1)

Planning Time: 1.417 ms

Execution Time: 411.019 ms

--------------------------------------------------------------------------------






At 2023-06-05 22:53:56, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

try this (there is some comment)

with t_res as (
select RSNO,  KNO
from TBL_RES
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
), t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from T_RES -- this is tbl_res already filter by date
inner join TBL_CUST T_CUST on T_RES.RSNO = T_CUST.RSNO
inner join TBL_POV T_POV on T_POV.CRSNO = T_RES.CRSNO -- why you use this table? it doesn't seem to be used to extract data. Are you trying to extract data from T_RES that have at least a record in T_POV? in this case could work better move this join in the first with (using distinct or group by to ensure there will be just a record for rsno and kno)
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO       

Il giorno lun 5 giu 2023 alle ore 12:06 gzh <gzhcoder@126.com> ha scritto:

Thank you very much for taking the time to reply to my question. 

I followed your suggestion and rewrote the SQL using Common Table Expression (CTE). 

Unfortunately, there was no significant improvement in performance.



At 2023-06-05 17:47:25, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

Hello,
In many case a formal writing and usage of with statement could solve the issue.
If you need join, use always join:
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ)
this is an inner join.

I mean something like this
with t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)

select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contain lessere record a good idea is start from this table and use in join with other


Il giorno lun 5 giu 2023 alle ore 08:57 gzh <gzhcoder@126.com> ha scritto:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!



--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]

Re: Is there any good optimization solution to improve the query efficiency?

From
Lorusso Domenico
Date:
I'm happy to help.
Looking the explanation the first with on t_res goes in parallel full table scan... this is an issue.
Should be present an index on the temporale period (I'm just looking for the same problem)

Il giorno mar 6 giu 2023 alle ore 10:33 gzh <gzhcoder@126.com> ha scritto:

I made some slight changes to the SQL you provided, but the optimization approach remained the same. 

I was surprised that the results were retrieved in less than one second. It's really impressive!

Below is the execution plan. Thank you very much for providing the optimization method, I learned a lot from it.


explain analyse

with t_res as

(select RSNO, KNO, CRSNO

    from tbl_res

   where tbl_res.CID >= to_date('2022/07/01', 'YYYY/MM/DD')

     and tbl_res.CID <= to_date('2022/07/31', 'YYYY/MM/DD')

     and tbl_res.COD >= to_date('2022/07/01', 'YYYY/MM/DD')

     and tbl_res.COD <= to_date('2022/07/31', 'YYYY/MM/DD')),

t_pov2 as

(select t_cust.RSNO, t_cust.KNO, MIN(t_cust.GSTSEQ) GSTSEQ

    from t_res -- this is tbl_res already filter by date

   inner join tbl_cust t_cust

      on t_res.RSNO = t_cust.RSNO

   inner join tbl_pov t_pov

      on t_pov.CRSNO = t_res.CRSNO -- why you use this table? it doesn't seem to be used to extract data. Are you trying to extract data from t_res that have at least a record in t_pov? in this case could work better move this join in the first with (using distinct or group by to ensure there will be just a record for RSNO and KNO)

   where t_cust.STSFLG = 'T'

     and t_cust.DISPSEQ <> 9999

     AND t_cust.KFIX = '0'

   group by t_cust.RSNO, t_cust.KNO),

t_pov3 as

(select t_cust.RSNO RSNO2, t_cust.KNO, t_cust.AGE, t_cust.GST

    from tbl_cust t_cust

   inner join t_pov2

      on t_pov2.RSNO = t_cust.RSNO

     and t_pov2.KNO = t_cust.KNO

     and t_pov2.GSTSEQ = t_cust.GSTSEQ)

select *

  from t_res

  left outer join t_pov3

    on t_res.RSNO = t_pov3.RSNO2

   and t_res.KNO = t_pov3.KNO


----- execution plan -----

Hash Right Join  (cost=125923.21..132076.05 rows=472 width=164) (actual time=408.252..410.342 rows=15123 loops=1)

  Hash Cond: ((t_cust.RSNO = t_res.RSNO) AND ((t_cust.KNO)::text = (t_res.KNO)::text))

  CTE t_res

    ->  Gather  (cost=1000.00..58410.51 rows=472 width=27) (actual time=55.587..207.684 rows=15123 loops=1)

          Workers Planned: 2

          Workers Launched: 2

          ->  Parallel Seq Scan on tbl_res  (cost=0.00..57363.31 rows=197 width=27) (actual time=49.850..204.235 rows=5041 loops=3)

                Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))

                Rows Removed by Filter: 161714

  ->  Nested Loop  (cost=67496.18..73648.88 rows=1 width=56) (actual time=191.880..191.924 rows=11 loops=1)

        ->  GroupAggregate  (cost=67495.75..67510.49 rows=737 width=50) (actual time=191.869..191.878 rows=11 loops=1)

              Group Key: t_cust_1.RSNO, t_cust_1.KNO

              ->  Sort  (cost=67495.75..67497.59 rows=737 width=23) (actual time=191.859..191.862 rows=13 loops=1)

                    Sort Key: t_cust_1.RSNO, t_cust_1.KNO

                    Sort Method: quicksort  Memory: 26kB

                    ->  Nested Loop  (cost=57118.88..67460.65 rows=737 width=23) (actual time=172.185..191.837 rows=13 loops=1)

                          ->  Hash Join  (cost=57118.45..58758.38 rows=472 width=14) (actual time=172.154..191.647 rows=13 loops=1)

                                Hash Cond: ((t_res_1.crsno)::text = (t_pov.crsno)::text)

                                ->  CTE Scan on t_res t_res_1  (cost=0.00..9.44 rows=472 width=72) (actual time=0.003..1.445 rows=15123 loops=1)

                                ->  Hash  (cost=51380.09..51380.09 rows=330109 width=9) (actual time=170.350..170.350 rows=330109 loops=1)

                                      Buckets: 131072  Batches: 8  Memory Usage: 2707kB

                                      ->  Seq Scan on tbl_pov t_pov  (cost=0.00..51380.09 rows=330109 width=9) (actual time=0.029..124.632 rows=330109 loops=1)

                          ->  Index Scan using tbl_cust_pk on tbl_cust t_cust_1  (cost=0.43..18.42 rows=2 width=23) (actual time=0.011..0.012 rows=1 loops=13)

                                Index Cond: (RSNO = t_res_1.RSNO)

                                Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))

                                Rows Removed by Filter: 2

        ->  Index Scan using tbl_cust_pk on tbl_cust t_cust  (cost=0.43..8.31 rows=1 width=61) (actual time=0.003..0.003 rows=1 loops=11)

              Index Cond: ((RSNO = t_cust_1.RSNO) AND (gstseq = (min(t_cust_1.gstseq))))

              Filter: ((t_cust_1.KNO)::text = (KNO)::text)

  ->  Hash  (cost=9.44..9.44 rows=472 width=108) (actual time=216.361..216.361 rows=15123 loops=1)

        Buckets: 16384 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 882kB

        ->  CTE Scan on t_res  (cost=0.00..9.44 rows=472 width=108) (actual time=55.591..211.698 rows=15123 loops=1)

Planning Time: 1.417 ms

Execution Time: 411.019 ms

--------------------------------------------------------------------------------






At 2023-06-05 22:53:56, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

try this (there is some comment)

with t_res as (
select RSNO,  KNO
from TBL_RES
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
), t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from T_RES -- this is tbl_res already filter by date
inner join TBL_CUST T_CUST on T_RES.RSNO = T_CUST.RSNO
inner join TBL_POV T_POV on T_POV.CRSNO = T_RES.CRSNO -- why you use this table? it doesn't seem to be used to extract data. Are you trying to extract data from T_RES that have at least a record in T_POV? in this case could work better move this join in the first with (using distinct or group by to ensure there will be just a record for rsno and kno)
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO       

Il giorno lun 5 giu 2023 alle ore 12:06 gzh <gzhcoder@126.com> ha scritto:

Thank you very much for taking the time to reply to my question. 

I followed your suggestion and rewrote the SQL using Common Table Expression (CTE). 

Unfortunately, there was no significant improvement in performance.



At 2023-06-05 17:47:25, "Lorusso Domenico" <domenico.l76@gmail.com> wrote:

Hello,
In many case a formal writing and usage of with statement could solve the issue.
If you need join, use always join:
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ)
this is an inner join.

I mean something like this
with t_pov2 as (
select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO
), t_pov3 as (
select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST
inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)

select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contain lessere record a good idea is start from this table and use in join with other


Il giorno lun 5 giu 2023 alle ore 08:57 gzh <gzhcoder@126.com> ha scritto:

Hi everyone,

I'm running into some performance issues with my SQL query.
The following SQL query is taking a long time to execute.

Execution Plan:
explain analyse
select * from TBL_RES
left outer join(select T_CUST.RSNO RSNO2 ,
T_CUST.KNO ,
T_CUST.AGE ,
T_CUST.GST
from TBL_CUST T_CUST ,
(select T_CUST.RSNO ,
T_CUST.KNO ,
MIN(T_CUST.GSTSEQ) GSTSEQ
from TBL_CUST T_CUST ,
TBL_POV T_POV ,
TBL_RES T_RES
where T_CUST.STSFLG = 'T'
and T_CUST.DISPSEQ <> 9999
AND T_CUST.KFIX = '0'
and T_POV.CRSNO = T_RES.CRSNO
and T_RES.RSNO = T_CUST.RSNO
group by T_CUST.RSNO , T_CUST.KNO) T_POV2
where T_POV2.RSNO = T_CUST.RSNO
and T_POV2.KNO = T_CUST.KNO
and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2
and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
----- Execution Plan -----
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
                                ->  Partial GroupAggregate  (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                      ->  Sort  (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                            Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                            Sort Method: external merge  Disk: 5480kB
                                            Worker 0:  Sort Method: external merge  Disk: 5520kB
                                            ->  Parallel Hash Join  (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                                  Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                                  ->  Parallel Seq Scan on TBL_CUST T_CUST_1  (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                                        Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                                        Rows Removed by Filter: 835318
                                                  ->  Parallel Hash  (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                                        Buckets: 131072  Batches: 8  Memory Usage: 3008kB
                                                        ->  Parallel Hash Join  (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                              Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
                                                              ->  Parallel Seq Scan on TBL_RES T_RES  (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)
                                                              ->  Parallel Hash  (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)
                                                                    Buckets: 131072  Batches: 8  Memory Usage: 2976kB
                                                                    ->  Parallel Seq Scan on TBL_POV T_POV  (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)
Planning Time: 1.064 ms
Execution Time: 996062.382 ms
--------------------------------------------------------------------------------

The amount of data in the table is as follows.
TBL_RES    500265
TBL_CUST   2000752
TBL_POV    330109

Any suggestions for improving the performance of the query would be greatly appreciated.

Thanks in advance!



--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


--
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]