On 20.09.2018 at 13:11, Durgamahesh Manne wrote:
> The query executed quickly without DISTINCT, but takes around 7 minutes
> to complete with DISTINCT:
>
> select distinct sub_head."vchSubmittersCode", rec."vchFileName",
>     rec."vchCusipFundIDSubFundID", rec."vchFundUnitPrice",
>     sub_head."vchValuationDate", rec."vchAdvisorLabel"
> from table1 rec
> join table2 sub_head
>     on rec."vchSubmittersCode" = sub_head."vchSubmittersCode"
> where rec."bFetch" = false
>     and sub_head."bFetch" = false;
>
> I need the above DISTINCT query to run in less time, but it keeps taking
> a long time even though I have created indexes on the required columns
> of the tables.
As you can see, there are 40,500,000 rows to sort in order to filter out the duplicate rows; the result contains 'only' 3,700,000 rows. But for this step the database needs nearly 4TB on disk. This will, of course, take some time.
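To see whether the sort is spilling to disk, you can inspect the plan yourself; the telltale line is "Sort Method: external merge  Disk: ...". A sketch, reusing the column and table names from your query:

```sql
-- Run the slow query under EXPLAIN ANALYZE and check the Sort node.
-- "Sort Method: external merge  Disk: ..." means the sort did not fit
-- into work_mem; "Sort Method: quicksort  Memory: ..." means it did.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT sub_head."vchSubmittersCode", rec."vchFileName",
       rec."vchCusipFundIDSubFundID", rec."vchFundUnitPrice",
       sub_head."vchValuationDate", rec."vchAdvisorLabel"
FROM table1 rec
JOIN table2 sub_head
  ON rec."vchSubmittersCode" = sub_head."vchSubmittersCode"
WHERE rec."bFetch" = false
  AND sub_head."bFetch" = false;
```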
If you have enough RAM you can try setting work_mem to 5 or 6 GB to change the plan to an in-memory sort. But keep in mind, this is dangerous! If the machine doesn't have enough free RAM, the kernel's out-of-memory (OOM) killer may decide to start killing processes.
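To limit the risk, you can raise work_mem for a single transaction only instead of server-wide; the 6GB value below is just an illustration, size it to your actual free RAM:

```sql
-- SET LOCAL applies only inside this transaction, so the global
-- work_mem setting (and every other session) stays untouched.
BEGIN;
SET LOCAL work_mem = '6GB';  -- illustrative value
-- ... run the DISTINCT query here ...
COMMIT;
```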
What kind of disks do you have? Maybe you can put the temporary files on a separate fast SSD via temp_tablespaces?
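A sketch of how that could look, assuming the SSD is mounted at a path like /mnt/ssd_temp (the mount point and tablespace name here are placeholders):

```sql
-- As a superuser: create a tablespace on the fast SSD, then point
-- temporary files (sort spill files, temp tables) at it.
CREATE TABLESPACE temp_ssd LOCATION '/mnt/ssd_temp';

-- Per session, or set it in postgresql.conf to make it the default:
SET temp_tablespaces = 'temp_ssd';
```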