Re: Track the amount of time waiting due to cost_delay - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Track the amount of time waiting due to cost_delay
Date
Msg-id ZmmOOPwMFIltkdsN@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Track the amount of time waiting due to cost_delay  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On Tue, Jun 11, 2024 at 01:13:48PM -0400, Robert Haas wrote:
> On Tue, Jun 11, 2024 at 5:49 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > As we can see the actual wait time is 30ms less than the intended wait time with
> > this simple test. So I still think we should go with 1) actual wait time and 2)
> > report the number of waits (as mentioned in [1]). Does that make sense to you?
> 
> I like the idea of reporting the actual wait time better,

+1

> provided
> that we verify that doing so isn't too expensive. I think it probably
> isn't, because in a long-running VACUUM there is likely to be disk
> I/O, so the CPU overhead of a few extra gettimeofday() calls should be
> fairly low by comparison.

Agreed.
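
To make the "actual wait time" idea concrete, here is a minimal sketch in
Python (not the patch itself) of the measurement pattern: read a clock before
and after each sleep and accumulate the difference next to the intended total.
The function names are hypothetical. Note that time.sleep() resumes after
signals, so unlike the interrupted sleeps discussed upthread this will
typically not show the actual total falling below the intended one; it only
illustrates the two extra clock reads per wait.

```python
import time


def timed_sleep(intended_ms):
    """Sleep for intended_ms and return the elapsed time actually measured (ms)."""
    start = time.perf_counter()
    time.sleep(intended_ms / 1000.0)
    return (time.perf_counter() - start) * 1000.0


def accumulate_waits(n_waits, intended_ms):
    """Return (intended_total_ms, actual_total_ms) over n_waits sleeps.

    intended_total_ms is what we asked for; actual_total_ms is what the
    clock says we really spent sleeping.
    """
    intended_total = 0.0
    actual_total = 0.0
    for _ in range(n_waits):
        intended_total += intended_ms
        actual_total += timed_sleep(intended_ms)
    return intended_total, actual_total
```

For example, accumulate_waits(5, 1) returns 5.0 as the intended total and
whatever the clock measured as the actual total.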

> I wonder if there's a noticeable hit when
> everything is in-memory. I guess probably not, because with any sort
> of normal configuration, we shouldn't be delaying after every block we
> process, so the cost of those gettimeofday() calls should still be
> getting spread across quite a bit of real work.

I did some testing, with:

shared_buffers = 12GB
vacuum_cost_delay = 1
autovacuum_vacuum_cost_delay = 1
max_parallel_maintenance_workers = 0
max_parallel_workers = 0

added to a default config file.
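
As a rough illustration of why a delay does not happen after every block, here
is a simplified model of the cost accounting, assuming the default
vacuum_cost_limit of 200 and per-page costs supplied by the caller.
count_delay_points is a hypothetical helper, not a PostgreSQL function, and it
ignores details such as parallel workers and how the sleep length is computed.

```python
def count_delay_points(page_costs, cost_limit=200):
    """Count how often a simplified cost-based delay would trigger.

    page_costs: iterable of per-page costs (for example 1 for a buffer hit).
    cost_limit: accumulated cost at which a delay happens (assumed default 200).
    """
    balance = 0
    delays = 0
    for cost in page_costs:
        balance += cost
        if balance >= cost_limit:
            # A real implementation would sleep here; we only count the event
            # and reset the accumulated cost.
            delays += 1
            balance = 0
    return delays
```

With pages that each cost 1 (the default vacuum_cost_page_hit), 1000 pages
trigger 5 delay points in this model, so the clock reads are spread over many
processed blocks.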

A table and all its indexes were fully in memory; the buffer counts are:

postgres=# SELECT n.nspname, c.relname, count(*) AS buffers
             FROM pg_buffercache b JOIN pg_class c
             ON b.relfilenode = pg_relation_filenode(c.oid) AND
                b.reldatabase IN (0, (SELECT oid FROM pg_database
                                      WHERE datname = current_database()))
             JOIN pg_namespace n ON n.oid = c.relnamespace
             GROUP BY n.nspname, c.relname
             ORDER BY 3 DESC
             LIMIT 11;

 nspname |      relname      | buffers
---------+-------------------+---------
 public  | large_tbl         |  222280
 public  | large_tbl_pkey    |    5486
 public  | large_tbl_filler7 |    1859
 public  | large_tbl_filler4 |    1859
 public  | large_tbl_filler1 |    1859
 public  | large_tbl_filler6 |    1859
 public  | large_tbl_filler3 |    1859
 public  | large_tbl_filler2 |    1859
 public  | large_tbl_filler5 |    1859
 public  | large_tbl_filler8 |    1859
 public  | large_tbl_version |    1576
(11 rows)


The observed timings when vacuuming this table are:

On master:

vacuum phase: cumulative duration
---------------------------------

scanning heap: 00:00:37.808184
vacuuming indexes: 00:00:41.808176
vacuuming heap: 00:00:54.808156

On master patched with actual time delayed:

vacuum phase: cumulative duration
---------------------------------

scanning heap: 00:00:36.502104 (time_delayed: 22202)
vacuuming indexes: 00:00:41.002103 (time_delayed: 23769)
vacuuming heap: 00:00:54.302096 (time_delayed: 34886)

As we can see, there is no noticeable degradation, even though the vacuum went
through this instrumentation code path about 34886 times (cost_delay was set to 1).
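
For reference, the per-phase arithmetic behind that statement can be checked
with a small Python sketch. It assumes that time_delayed is cumulative, like
the durations, and the names MASTER, PATCHED, parse_duration and compare are
hypothetical; the values are copied from the two runs above.

```python
from datetime import timedelta

# Cumulative durations per phase, copied from the two runs above.
MASTER = {
    "scanning heap": "00:00:37.808184",
    "vacuuming indexes": "00:00:41.808176",
    "vacuuming heap": "00:00:54.808156",
}
# (cumulative duration, cumulative time_delayed) for the patched run.
PATCHED = {
    "scanning heap": ("00:00:36.502104", 22202),
    "vacuuming indexes": ("00:00:41.002103", 23769),
    "vacuuming heap": ("00:00:54.302096", 34886),
}


def parse_duration(text):
    """Parse 'HH:MM:SS.ffffff' (six fractional digits) into a timedelta."""
    hms, micros = text.split(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds,
                     microseconds=int(micros))


def compare(master, patched):
    """Return (phase, master minus patched in seconds, time_delayed added in phase)."""
    rows = []
    previous_delayed = 0
    for phase, master_text in master.items():
        patched_text, delayed = patched[phase]
        delta = parse_duration(master_text) - parse_duration(patched_text)
        rows.append((phase, delta.total_seconds(), delayed - previous_delayed))
        previous_delayed = delayed
    return rows


for phase, seconds_saved, delayed_in_phase in compare(MASTER, PATCHED):
    print(f"{phase}: master - patched = {seconds_saved:.6f}s, "
          f"time_delayed added in phase = {delayed_in_phase}")
```

A positive difference means the patched run was not slower at that point; here
the differences are 1.306080s, 0.806073s and 0.506060s, and the time_delayed
increments per phase are 22202, 1567 and 11117.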

> That said, I'm not sure this experiment shows a real problem with the
> idea of showing intended wait time. It does establish the concept that
> repeated signals can throw our numbers off, but 30ms isn't much of a
> discrepancy.

Yeah, the idea was just to show how easy it is to create a 30ms discrepancy.

> I'm worried about being off by a factor of two, or an
> order of magnitude. I think we still don't know if that can happen,
> but if we're going to show actual wait time anyway, then we don't need
> to explore the problems with other hypothetical systems too much.

Agreed.

> I'm not convinced that reporting the number of waits is useful. If we
> were going to report a possibly-inaccurate amount of actual waiting,
> then also reporting the number of waits might make it easier to figure
> out when the possibly-inaccurate number was in fact inaccurate. But I
> think it's way better to report an accurate amount of actual waiting,
> and then I'm not sure what we gain by also reporting the number of
> waits.

Sami shared his thoughts in [1] and [2], and so did I in [3]. If some of us still
don't think that reporting the number of waits is useful, then we can probably
start without it.

[1]: https://www.postgresql.org/message-id/0EA474B6-BF88-49AE-82CA-C1A9A3C17727%40amazon.com
[2]: https://www.postgresql.org/message-id/E12435E2-5FCA-49B0-9ADB-0E7153F95E2D%40amazon.com
[3]: https://www.postgresql.org/message-id/ZmmGG4e%2BqTBD2kfn%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


