Re: Track the amount of time waiting due to cost_delay - Mailing list pgsql-hackers
From: Bertrand Drouvot
Subject: Re: Track the amount of time waiting due to cost_delay
Msg-id: Znp/zyfvO1wwtGu6@ip-10-97-1-34.eu-west-3.compute.internal
In response to: Re: Track the amount of time waiting due to cost_delay ("Imseih (AWS), Sami" <simseih@amazon.com>)
List: pgsql-hackers
Hi,

On Tue, Jun 25, 2024 at 01:12:16AM +0000, Imseih (AWS), Sami wrote:

Thanks for the feedback!

> >> 2. the leader being interrupted while waiting is also already happening on master
> >> due to the pgstat_progress_parallel_incr_param() calls in
> >> parallel_vacuum_process_one_index() (that have been added in
> >> 46ebdfe164). It has been the case "only" 36 times during my test case.
>
> 46ebdfe164 will interrupt the leader's sleep every time a parallel worker reports
> progress, and we currently don't handle interrupts by restarting the sleep with
> the remaining time. nanosleep does provide the ability to restart with the remaining
> time [1], but I don't think it's worth the effort to ensure more accurate
> vacuum delays for the leader process.

+1. I don't think it's necessary to have a 100% accurate delay every time the
delay is involved. It's a heuristic parameter (along with the cost limit): what
matters in the end is by how much you've been able to pause the whole vacuum,
not the accuracy of each individual sleep.
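Out of curiosity, the nanosleep() restart mentioned above would boil down to
something like this (a minimal standalone sketch of the remainder mechanics
from [1] only, not something the patch does; in the backend the delay goes
through vacuum_delay_point() and we'd still want to service the interrupt
before going back to sleep):

    #include <errno.h>
    #include <time.h>

    /*
     * Sleep for the full requested duration: if interrupted by a signal,
     * nanosleep() fills "rem" with the not-yet-slept time, so we simply
     * go back to sleep with that remainder.
     */
    static void
    sleep_full(long msec)
    {
        struct timespec req;
        struct timespec rem;

        req.tv_sec = msec / 1000;
        req.tv_nsec = (msec % 1000) * 1000000L;

        while (nanosleep(&req, &rem) == -1 && errno == EINTR)
            req = rem;          /* restart with the remaining time */
    }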
> > 1. Having a time based only approach to throttle
>
> I do agree with a time based approach overall.
>
> > 1.1) the more parallel workers are used, the less the impact of the leader on
> > the vacuum index phase duration/workload is (because the repartition is done
> > on more processes).
>
> Did you mean "because the vacuum is done on more processes"?

Yes.

> When a leader is operating on a large index(s) during the entirety
> of the vacuum operation, wouldn't more parallel workers end up
> interrupting the leader more often?

That's right, but my point was about the impact on the "whole" duration time
and "whole" workload (leader + workers included), not about the number of times
the leader is interrupted. If there are, say, 100 workers, then interrupting
the leader (1 process out of 101) is probably less of an issue, as it means
there is a lot of work to be done to keep those 100 workers busy.

I don't think the size of the index the leader is vacuuming has an impact:
having the leader vacuum one 100GB index or 100 x 1GB indexes is the same (as
long as all the other workers are active during all that time).

> > 3. A 1 second reporting "throttling" looks like a reasonable threshold as:
>
> > 3.1 the idea is to have a significant impact when the leader could have been
> > interrupted say hundred/thousand times per second.
>
> > 3.2 it does not make that much sense for any tools to sample pg_stat_progress_vacuum
> > multiple times per second (so a one second reporting granularity seems ok).
>
> I feel 1 second may still be too frequent.

Maybe we'll need more measurements, but this is what my test case is made of:

    vacuum_cost_delay = 1
    vacuum_cost_limit = 10
    8 parallel workers, 1 leader
    21 indexes (about 1GB each, one 40MB), all in memory

With a 1 second reporting frequency, the leader has been interrupted about 2500
times over 8m39s, leading to about the same duration as on master (8m43s).

> What about 10 seconds ( or 30 seconds )?

I'm not sure (may need more measurements), but it would probably complicate the
reporting a bit (as with the current v3 we'd miss reporting the indexes that
take less time than the threshold to complete).

> I think this metric in particular will be mainly useful for vacuum runs that are
> running for minutes or more, making reporting every 10 or 30 seconds
> still useful.

Agree. OTOH, one could be interested in diagnosing what happened during, say, a
5 second peak in I/O resource consumption/latency. Sampling
pg_stat_progress_vacuum at a 1 second interval and seeing by how much the
vacuum has been paused during that second could help too (especially if the
vacuum is made of a lot of parallel workers that could lead to a lot of I/O).
But it would miss data if we report at a larger interval.
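For example (a rough libpq sketch of such a 1 second sampler; I'm assuming here
that the counter is exposed as a bigint milliseconds column named time_delayed
in pg_stat_progress_vacuum, so adjust to the actual column name, and a real
tool would track rows per pid since a counter disappears once its vacuum
finishes):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* connection parameters come from the usual PG* environment variables */
        PGconn *conn = PQconnectdb("");
        long    prev = -1;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        for (;;)
        {
            PGresult *res = PQexec(conn,
                "SELECT COALESCE(sum(time_delayed), 0)"
                " FROM pg_stat_progress_vacuum");

            if (PQresultStatus(res) == PGRES_TUPLES_OK)
            {
                long cur = atol(PQgetvalue(res, 0, 0));

                /* delta between two samples = time spent sleeping in that second */
                if (prev >= 0)
                    printf("vacuums paused ~%ld ms during the last second\n",
                           cur - prev);
                prev = cur;
            }
            PQclear(res);
            sleep(1);
        }
    }

That would directly answer "by how much did the vacuums pause during that
second", but only if the view is refreshed at (at least) that granularity.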
> It just occurred to me also that pgstat_progress_parallel_incr_param
> should have a code comment that it will interrupt a leader process and
> cause activity such as a sleep to end early.

Good point, I'll add a comment for it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com