Re: How to improve the performance of my SQL query? - Mailing list pgsql-general

From Alban Hertroys
Subject Re: How to improve the performance of my SQL query?
Date
Msg-id 14E94C37-BA5A-4A1B-93E4-4E06741BC2C1@gmail.com
In response to Re: How to improve the performance of my SQL query?  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general
> On 29 Jul 2023, at 10:59, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> On 2023-07-26 15:46:16 +0800, gzh wrote:
>> SET enable_seqscan TO off;
> [...]
>>                    ->  Parallel Bitmap Heap Scan on tbl_sha  (cost=92112.45..2663789.14 rows=800650 width=18) (actual time=260.540..21442.169 rows=804500 loops=3)
>>                          Recheck Cond: (ms_cd = 'MLD009'::bpchar)
>>                          Rows Removed by Index Recheck: 49
>>                          Filter: (etrys = '00000001'::bpchar)
>>                          Rows Removed by Filter: 295500
>>                          Heap Blocks: exact=13788 lossy=10565
>>                          ->  Bitmap Index Scan on index_search_04_mscd_cdate  (cost=0.00..91632.06 rows=3402599 width=0) (actual time=249.718..249.718 rows=3300000 loops=1)
>>                                Index Cond: (ms_cd = 'MLD009'::bpchar)
>
> So now it's using index_search_04_mscd_cdate which contains only ms_cd
> (and - judging from the name, other fields not relevant to this query),
> but it still doesn't use index_search_01 which would fit the query
> exactly. I can understand that Postgres prefers a sequential scan over
> an index scan (the number of matching rows is about 10% of the total
> table size which is a lot), but why would it prefer a less specific
> index to a more specific one?
>
> Can you get Postgres to use that index at all?
>
> Find a combination of ms_cd and etrys which doesn't cover millions of
> rows and try that.
>
> Also try lowering random_page_cost.

Wasn’t this an RDS server with just 4GB of memory?

How large are those multi-column indices? Perhaps they don’t (all) fit into available cache memory and the server decided to use the one that it had cached?
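
For what it’s worth, a query along these lines should show how large those indexes actually are (the table name tbl_sha is taken from the plan above; adjust to match your schema):

    SELECT indexrelname,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
      FROM pg_stat_user_indexes
     WHERE relname = 'tbl_sha'
     ORDER BY pg_relation_size(indexrelid) DESC;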

I’m frankly not at all certain how the server would behave around such resource shortage situations, but I suppose loading an uncached index into cache could get a higher cost than using a less optimal (costlier) index that’s already cached.
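
To check what is actually sitting in shared_buffers (only part of the picture, since the OS page cache is invisible from SQL), the pg_buffercache extension can show it. A rough sketch, assuming the indexes are named index_search_* as in the plans posted earlier:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    -- number of shared buffers (8 kB each by default) held per index
    SELECT c.relname,
           count(*) AS buffers,
           pg_size_pretty(count(*) * 8192) AS cached
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = pg_relation_filenode(c.oid)
       AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                 WHERE datname = current_database()))
     WHERE c.relname LIKE 'index_search%'
     GROUP BY c.relname
     ORDER BY buffers DESC;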


Regarding lowering random_page_cost: if your index files are on SSD storage, lowering it sufficiently (to a realistic value) could reduce the estimated cost of loading that uncached index into memory enough for the planner to pick it, evicting the index it was using in the above plan to make room (unless other active sessions are using it).
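
As a session-level experiment (not a suggestion for the cluster-wide setting), something along these lines would show whether the planner then switches to index_search_01. The WHERE clause is taken from the plan above; the count(*) is just a stand-in for the real select list:

    SET random_page_cost = 1.1;  -- default is 4.0; values around 1.1 are common for SSDs

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*)
      FROM tbl_sha
     WHERE ms_cd = 'MLD009'
       AND etrys = '00000001';

    RESET random_page_cost;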

Alban Hertroys
--
There is always an exception to always.
