On Wed, Dec 29, 2021 at 5:46 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* SATYANARAYANA NARLAPURAM (satyanarlapuram@gmail.com) wrote:
> On Sat, Dec 25, 2021 at 9:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > On Sun, Dec 26, 2021 at 10:36 AM SATYANARAYANA NARLAPURAM <
> > satyanarlapuram@gmail.com> wrote:
> >>> Actually all the WAL insertions are done under a critical section
> >>> (except few exceptions), that means if you see all the references of
> >>> XLogInsert(), it is always called under the critical section and that is my
> >>> main worry about hooking at XLogInsert level.
> >>>
> >>
> >> Got it, understood the concern. But can we document the limitations of
> >> the hook and let the hook take care of it? I don't expect an error to be
> >> thrown here since we are not planning to allocate memory or make file
> >> system calls but instead look at the shared memory state and add delays
> >> when required.
> >>
> >>
> > Yet another problem is that if we are in XlogInsert() that means we are
> > holding the buffer locks on all the pages we have modified, so if we add a
> > hook at that level which can make it wait then we would also block any of
> > the read operations needed to read from those buffers. I haven't thought
> > what could be better way to do this but this is certainly not good.
> >
>
> Yes, this is a problem. The other approach is adding a hook at
> XLogWrite/XLogFlush? All the other backends will be waiting behind the
> WALWriteLock. The process that is performing the write enters into a busy
> loop with small delays until the criteria are met. Inability to process the
> interrupts inside the critical section is a challenge in both approaches.
> Any other thoughts?
Why not have this work the exact same way sync replicas do, except that it's based off of some byte/time lag for some set of async replicas? That is, in RecordTransactionCommit(), perhaps right after the SyncRepWaitForLSN() call, or maybe even add this to that function? Sure seems like there's a lot of similarity.
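To make the suggestion above concrete, here is a minimal standalone sketch of the kind of check a commit-time wait might evaluate. It is not PostgreSQL code: the function name commit_should_wait, the needed and max_lag_bytes parameters, and the "N replicas within a byte lag" policy are all assumptions for illustration. Only the idea of comparing the commit LSN against replica flush positions comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* WAL position as a byte offset (PostgreSQL's XLogRecPtr is also 64-bit). */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical policy check, not an existing PostgreSQL function.
 * Returns true if the committing backend should keep waiting, i.e. fewer
 * than `needed` async replicas have flushed WAL to within `max_lag_bytes`
 * of `commit_lsn`.
 */
static bool
commit_should_wait(XLogRecPtr commit_lsn,
                   const XLogRecPtr *replica_flush_lsn,
                   size_t n_replicas,
                   size_t needed,
                   uint64_t max_lag_bytes)
{
    size_t caught_up = 0;

    assert(needed >= 1);

    for (size_t i = 0; i < n_replicas; i++)
    {
        /* A replica at or past the commit LSN has zero lag. */
        uint64_t lag = (replica_flush_lsn[i] >= commit_lsn)
            ? 0
            : commit_lsn - replica_flush_lsn[i];

        if (lag <= max_lag_bytes)
            caught_up++;
    }

    return caught_up < needed;
}
```

A real implementation would re-evaluate such a condition each time a replica reports progress, much as a sync-rep wait does, and would run outside any critical section so interrupts can still be processed.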
I was thinking of achieving log governance (throttling WAL MB/sec) and also providing RPO guarantees. In this commit-time model, it is hard to throttle WAL generation of a long-running transaction (for example, COPY or SELECT INTO). However, this meets my RPO needs. Are you in support of adding a hook or making the actual change? IMHO, a hook allows more creative options. I can go ahead and make a patch accordingly.
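As a rough illustration of what "throttling WAL MB/sec" could mean arithmetically, the sketch below computes how long a backend would need to pause so that its observed WAL rate stays under a limit. The helper name wal_throttle_delay_us and its parameters are hypothetical, and treating a limit of 0 as "disabled" is an assumption. Where such a delay could safely be applied is exactly the open question in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not an existing PostgreSQL function.
 * Returns the number of microseconds to sleep so that `wal_bytes`
 * generated over `elapsed_us` does not exceed `limit_bytes_per_sec`.
 * A limit of 0 is treated as "throttling disabled" (an assumption).
 */
static uint64_t
wal_throttle_delay_us(uint64_t wal_bytes,
                      uint64_t elapsed_us,
                      uint64_t limit_bytes_per_sec)
{
    uint64_t min_us;

    if (limit_bytes_per_sec == 0)
        return 0;

    /*
     * Minimum time the bytes should have taken at the limit rate.
     * Split into quotient and remainder so large byte counts do not
     * overflow when scaled to microseconds.
     */
    min_us = (wal_bytes / limit_bytes_per_sec) * 1000000
        + (wal_bytes % limit_bytes_per_sec) * 1000000 / limit_bytes_per_sec;

    return (min_us > elapsed_us) ? min_us - elapsed_us : 0;
}
```

For example, 16 MB of WAL generated in 0.5 s against an 8 MB/s limit gives a pause of 1.5 s. A check made only at commit would apply this pause once at the end of a long COPY, which is why it does not govern the rate during the transaction.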