Re: WAL usage calculation patch - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: WAL usage calculation patch |
Date | |
Msg-id | 20200329075549.6l54fcjfh73rxufb@nol Whole thread Raw |
In response to | Re: WAL usage calculation patch (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: WAL usage calculation patch
|
List | pgsql-hackers |
On Sun, Mar 29, 2020 at 11:03:50AM +0530, Amit Kapila wrote: > On Sat, Mar 28, 2020 at 7:08 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > Basically, > > > I don't know changes done in ExecInitParallelPlan and friends allow us > > > to compute WAL for parallel operations. Those will primarily cover > > > parallel queries that won't write WAL. How you have tested those > > > changes? > > > > I didn't tested those, and I'm not even sure how to properly and reliably test > > that. Do you have any advice on how to achieve that? > > > > However the patch is mimicking the buffer instrumentation that already exists, > > and the approach also looks correct to me. Do you have a reason to believe > > that the approach that works for buffer usage wouldn't work for WAL records? (I > > of course agree that this should be tested anyway) > > > > The buffer usage infrastructure is for read-only queries (for ex. for > stats like blks_hit, blks_read). As far as I can think, there is no > easy way to test the WAL usage via that API. It might or might not be > required in the future depending on whether we decide to use the same > infrastructure for parallel writes. I'm not sure that I get your point. I'm assuming that you meant parallel-read-only queries, but surely buffer usage infrastructure for parallel query relies on the same approach as non-parallel one (each node computes the process-local pgBufferUsage diff) and sums all of that at the end of the parallel query execution. I also don't see how whether the query is read-only or not is relevant here as far as instrumentation is concerned, especially since read-only query can definitely do writes and increase the count of dirtied buffers, like a write query would. For instance a hint bit change can be done in a parallel query AFAIK, and this can generate WAL records in wal_log_hints is enabled, so that's probably one way to test it. I now think that not adding support for WAL buffers in EXPLAIN output in the initial patch scope was a mistake, as this is probably the best way to test the WAL counters for parallel queries. This shouldn't be hard to add though, and I can work on it quickly if there's still a chance to get this feature included in pg13. > I think for now we should remove > that part of changes and rather think how to get that for parallel > operations that can write WAL. For ex. we might need to do something > similar to what this patch has done in begin_parallel_vacuum and > end_parallel_vacuum. Would you like to attempt that? Do you mean removing WAL buffers instrumentation from parallel query infrastructure? For parallel utility that can do writes it's probably better to keep the discussion in the other part of the thread. I tried to think a little bit about that, but for now I don't have a better idea than adding something similar to intrumentation for utility command to have a general infrastructure, as building a workaround for specific utility looks like the wrong approach. But this would require quite import changes in utility handling, which is maybe not a good idea a couple of week before the feature freeze, and that is definitely not backpatchable so that won't fix the issue for parallel index build that exists since pg11.
pgsql-hackers by date: