Hi,
> Just idea, if XactLockTableWait() is expected to finish within a few seconds
> after acquiring the lock, how about this approach: keep sleeping for 1ms
> until the total sleep time reaches 10s (10s is just an example),
> and after that, start doubling the sleep duration each cycle, up to
> a maximum of 10s. That is, even in non-"create replication slot" case,
> if the total sleep time exceeds 10s, it seems safe to double the sleep time
> up to 10s. This way, we stay responsive early on but can back off more
> aggressively if needed. Thought? Anyway we probably need to study
> XactLockTableWait() behavior more closely.
After some thought, I think that adding a parameter to distinguish
replication cases from heap/index cases may not be the best approach.
Other scenarios, such as batch operations or analytical workloads,
could cause long waits as well. An alternative would be to adopt a
hybrid strategy:
- For logical replication use cases: Apply exponential backoff immediately
- For other cases: Apply exponential backoff only after a certain
  threshold is reached
However, this is more complicated, and I don't see a clear benefit for now.
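To make the comparison concrete, here is a rough sketch of how the hybrid
schedule could be computed. It is not taken from the patch: the function
name next_sleep_us(), the immediate_backoff flag, and the constants are all
hypothetical, and the 1ms / 10s figures are just the example values
mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants; the values are only the examples from this thread. */
#define MIN_SLEEP_US          INT64_C(1000)      /* 1ms initial sleep */
#define MAX_SLEEP_US          INT64_C(10000000)  /* 10s cap on a single sleep */
#define BACKOFF_THRESHOLD_US  INT64_C(10000000)  /* 10s total before backing off */

/*
 * Given the sleep that was just performed (cur_sleep_us, 0 if none yet) and
 * the total time slept so far, return the next sleep duration.
 *
 * immediate_backoff = true  : double from the very first cycle
 *                             (the logical replication case).
 * immediate_backoff = false : stay at MIN_SLEEP_US until the total sleep
 *                             time reaches BACKOFF_THRESHOLD_US, then double.
 */
static int64_t
next_sleep_us(int64_t cur_sleep_us, int64_t total_slept_us,
              bool immediate_backoff)
{
    /* First cycle: always start with the minimum sleep. */
    if (cur_sleep_us < MIN_SLEEP_US)
        return MIN_SLEEP_US;

    /* Below the threshold, non-replication callers stay responsive. */
    if (!immediate_backoff && total_slept_us < BACKOFF_THRESHOLD_US)
        return MIN_SLEEP_US;

    /* Double the sleep, clamping at the cap without overflowing. */
    if (cur_sleep_us >= MAX_SLEEP_US / 2)
        return MAX_SLEEP_US;

    return cur_sleep_us * 2;
}
```

With immediate_backoff = false this reduces to the threshold-based scheme
suggested in the quoted text, so the flag is the only extra piece the hybrid
strategy would need.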
> Thanks for the patch! I haven't reviewed it yet, but since this is
> a v19 item, please add it to the next CommitFest so we don't lose
> track of it.
I've added it to July's CommitFest.
https://commitfest.postgresql.org/patch/5804/
> Also, I think it would be better to split the addition of the wait event
> and the introduction of exponential backoff in XactLockTableWait() into
> separate patches. They serve different purposes and can be committed
> independently.
The following is the split patch for adding the wait event.