RE: How to solve my slow disk i/o throughput during index scan - Mailing list pgsql-performance

From FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)
Subject RE: How to solve my slow disk i/o throughput during index scan
Date
Msg-id MR1P264MB34739538259E7FAF3A9DE4249DDE2@MR1P264MB3473.FRAP264.PROD.OUTLOOK.COM
In response to Re: How to solve my slow disk i/o throughput during index scan  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: How to solve my slow disk i/o throughput during index scan
List pgsql-performance

Hello,

Thank you, splitting the query into “OR” clauses definitely enables bitmap heap scans, and thus parallelized reads from disk
😃 ! I did not, however, understand your second point: what is a parallel append, and how do I enable it?

 

Simon F.

 


From: Andrei Lepikhov <lepihov@gmail.com>
Sent: Thursday, 4 July 2024 16:37
To: FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT) <simon.freyburger@sncf.fr>; pgsql-performance@lists.postgresql.org; Peter Geoghegan <pg@bowt.ie>
Subject: Re: How to solve my slow disk i/o throughput during index scan

 


On 4/7/2024 20:25, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE 
TGV / DM RMP YIELD MANAGEMENT) wrote:
> *My question is : what can I change to get a better index reading speed ?*
> 
> What I already tried :
> 
>   * Setting random_page_cost to prohibitive value (10000000) to force a
>     bitmap heap scan, because those can be made in parallel. This has
>     not worked, the optimizer is still doing an index scan on my fact table.
>   * Change effective_io_concurrency, max_parallel_workers_per_gather,
>     work_mem to much higher values.
I'm not sure this case is only about index scan speed. Look at the slow
index clause:
fth."ID_TRAIN" = ANY ('{17855,13945,536795,18838,18837,13574 ...
and many more values.
An IndexScan has to probe the index for each of these values and, for
each value, go through the heap pages to check the other conditions.
We have already discussed optimisations for this case in a couple of
pgsql-hackers threads, but I'm not sure we have a quick solution right now.
If you want to use a BitmapScan (which might be reasonable to try here),
you need to split the huge ANY (...) clause into a sequence of ORs.
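As an illustration of that rewrite (the table alias and column come from the quoted clause; "fact_table" and the SELECT list are placeholders, and the full value list stays elided), each OR arm carries a small ANY list, which lets the planner pick Bitmap Index Scans and merge them with a BitmapOr:

```sql
-- Before: one huge ANY list, planned as a plain index scan
--   WHERE fth."ID_TRAIN" = ANY ('{17855,13945,536795,18838,18837,13574, ...}')

-- After: the same predicate split into OR'ed chunks
SELECT ...
FROM fact_table fth
WHERE fth."ID_TRAIN" = ANY ('{17855,13945}')
   OR fth."ID_TRAIN" = ANY ('{536795,18838}')
   OR fth."ID_TRAIN" = ANY ('{18837,13574}');
   -- ... one OR arm per chunk of the original list
```

The chunk size is a tuning knob: smaller chunks give the planner more freedom, at the cost of a longer query text.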
Also, maybe a Parallel Append could help here? You can change the
corresponding startup and tuple costs to force such a plan.
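A sketch of what "changing the corresponding costs" could look like at the session level (these are standard PostgreSQL GUCs; whether a Parallel Append actually appears still depends on the table and partition layout):

```sql
-- Make parallel plans look cheap to the planner (session only):
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET max_parallel_workers_per_gather = 8;  -- allow more workers

-- Parallel Append itself is governed by this setting (on by default):
SET enable_parallel_append = on;

-- Then check which plan the planner now chooses:
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;
```

Zeroing the parallel costs is a diagnostic trick, not a production setting; once the plan shape is confirmed, the costs should be tuned back to realistic values.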
 
-- 
regards, Andrei Lepikhov
 
-------
This message and any attachments are intended solely for the addressees and are confidential. SNCF may not be held responsible for their contents whose accuracy and completeness cannot be guaranteed over the Internet. Unauthorized use, disclosure, distribution, copying, or any part thereof is strictly prohibited. If you are not the intended recipient of this message, please notify the sender immediately and delete it.
