Thread: Avoid unnecessary ReplicationSlotControl lwlock acquistion
Hi all, While testing with DTrace, I realized we acquire ReplicationSlotControl lwlock at some places even when max_replication_slots is set to 0. For instance, we call ReplicationSlotCleanup() within PostgresMian() when an error happens and acquire ReplicationSlotControl lwlock. The attached patch fixes some functions so that we quickly return if max_replication_slots is set to 0. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, On August 24, 2020 7:38:39 PM PDT, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: >Hi all, > >While testing with DTrace, I realized we acquire >ReplicationSlotControl lwlock at some places even when >max_replication_slots is set to 0. For instance, we call >ReplicationSlotCleanup() within PostgresMian() when an error happens >and acquire ReplicationSlotControl lwlock. > >The attached patch fixes some functions so that we quickly return if >max_replication_slots is set to 0. Why is it worth doing so? Regards, Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Tue, 25 Aug 2020 at 11:42, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On August 24, 2020 7:38:39 PM PDT, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > >Hi all, > > > >While testing with DTrace, I realized we acquire > >ReplicationSlotControl lwlock at some places even when > >max_replication_slots is set to 0. For instance, we call > >ReplicationSlotCleanup() within PostgresMian() when an error happens > >and acquire ReplicationSlotControl lwlock. > > > >The attached patch fixes some functions so that we quickly return if > >max_replication_slots is set to 0. > > Why is it worth doing so? I think we can avoid unnecessary overhead caused by acquiring and releasing that lwlock itself. The functions modified by this patch are called during error cleanup or checkpoints. For the former case, since it’s not a commit path the benefit might not be large on common workload but it might help to reduce the latency on a workload whose abort rate is relatively high. Also looking at other functions in slot.c, other functions also do so. I think these are also for preventing unnecessary overhead. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2020-08-25 12:00:47 +0900, Masahiko Sawada wrote: > On Tue, 25 Aug 2020 at 11:42, Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > > > On August 24, 2020 7:38:39 PM PDT, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > >Hi all, > > > > > >While testing with DTrace, I realized we acquire > > >ReplicationSlotControl lwlock at some places even when > > >max_replication_slots is set to 0. For instance, we call > > >ReplicationSlotCleanup() within PostgresMian() when an error happens > > >and acquire ReplicationSlotControl lwlock. > > > > > >The attached patch fixes some functions so that we quickly return if > > >max_replication_slots is set to 0. > > > > Why is it worth doing so? > > I think we can avoid unnecessary overhead caused by acquiring and > releasing that lwlock itself. The functions modified by this patch are > called during error cleanup or checkpoints. For the former case, > since it’s not a commit path the benefit might not be large on common > workload but it might help to reduce the latency on a workload whose > abort rate is relatively high. Also looking at other functions in > slot.c, other functions also do so. I think these are also for > preventing unnecessary overhead. I don't see how these could matter. The error checking path isn't reached if no slot is acquired. One uncontended lwlock acquisition isn't measurable compared to the cost of a checkpoint. Greetings, Andres Freund
On Tue, 25 Aug 2020 at 13:13, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2020-08-25 12:00:47 +0900, Masahiko Sawada wrote: > > On Tue, 25 Aug 2020 at 11:42, Andres Freund <andres@anarazel.de> wrote: > > > > > > Hi, > > > > > > On August 24, 2020 7:38:39 PM PDT, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > > >Hi all, > > > > > > > >While testing with DTrace, I realized we acquire > > > >ReplicationSlotControl lwlock at some places even when > > > >max_replication_slots is set to 0. For instance, we call > > > >ReplicationSlotCleanup() within PostgresMian() when an error happens > > > >and acquire ReplicationSlotControl lwlock. > > > > > > > >The attached patch fixes some functions so that we quickly return if > > > >max_replication_slots is set to 0. > > > > > > Why is it worth doing so? > > > > I think we can avoid unnecessary overhead caused by acquiring and > > releasing that lwlock itself. The functions modified by this patch are > > called during error cleanup or checkpoints. For the former case, > > since it’s not a commit path the benefit might not be large on common > > workload but it might help to reduce the latency on a workload whose > > abort rate is relatively high. Also looking at other functions in > > slot.c, other functions also do so. I think these are also for > > preventing unnecessary overhead. > > I don't see how these could matter. The error checking path isn't > reached if no slot is acquired. I think we always call ReplicationSlotCleanup() which acquires the lwlock during error cleanup even if no slot is acquired. > One uncontended lwlock acquisition isn't > measurable compared to the cost of a checkpoint. That's true. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services