Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: Minimal logical decoding on standbys
Date
Msg-id CAJ3gD9dC9oMGXqCAeisNOK2BkeaiOTYO2aTMFzi9SHnKtu4PJg@mail.gmail.com
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Minimal logical decoding on standbys
List pgsql-hackers
On Tue, 9 Apr 2019 at 22:23, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Sat, 6 Apr 2019 at 04:45, Andres Freund <andres@anarazel.de> wrote:
> > > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> > > index 006446b..5785d2f 100644
> > > --- a/src/backend/replication/slot.c
> > > +++ b/src/backend/replication/slot.c
> > > @@ -1064,6 +1064,85 @@ ReplicationSlotReserveWal(void)
> > >       }
> > >  }
> > >
> > > +void
> > > +ResolveRecoveryConflictWithSlots(Oid dboid, TransactionId xid)
> > > +{
> > > +     int                     i;
> > > +     bool            found_conflict = false;
> > > +
> > > +     if (max_replication_slots <= 0)
> > > +             return;
> > > +
> > > +restart:
> > > +     if (found_conflict)
> > > +     {
> > > +             CHECK_FOR_INTERRUPTS();
> > > +             /*
> > > +              * Wait awhile for them to die so that we avoid flooding an
> > > +              * unresponsive backend when system is heavily loaded.
> > > +              */
> > > +             pg_usleep(100000);
> > > +             found_conflict = false;
> > > +     }
> > > +
> > > +     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> > > +     for (i = 0; i < max_replication_slots; i++)
> > > +     {
> > > +             ReplicationSlot *s;
> > > +             NameData        slotname;
> > > +             TransactionId slot_xmin;
> > > +             TransactionId slot_catalog_xmin;
> > > +
> > > +             s = &ReplicationSlotCtl->replication_slots[i];
> > > +
> > > +             /* cannot change while ReplicationSlotCtlLock is held */
> > > +             if (!s->in_use)
> > > +                     continue;
> > > +
> > > +             /* not our database, skip */
> > > +             if (s->data.database != InvalidOid && s->data.database != dboid)
> > > +                     continue;
> > > +
> > > +             SpinLockAcquire(&s->mutex);
> > > +             slotname = s->data.name;
> > > +             slot_xmin = s->data.xmin;
> > > +             slot_catalog_xmin = s->data.catalog_xmin;
> > > +             SpinLockRelease(&s->mutex);
> > > +
> > > +             if (TransactionIdIsValid(slot_xmin) && TransactionIdPrecedesOrEquals(slot_xmin, xid))
> > > +             {
> > > +                     found_conflict = true;
> > > +
> > > +                     ereport(WARNING,
> > > +                                     (errmsg("slot %s w/ xmin %u conflicts with removed xid %u",
> > > +                                                     NameStr(slotname), slot_xmin, xid)));
> > > +             }
> > > +
> > > +             if (TransactionIdIsValid(slot_catalog_xmin) && TransactionIdPrecedesOrEquals(slot_catalog_xmin,
xid))
> > > +             {
> > > +                     found_conflict = true;
> > > +
> > > +                     ereport(WARNING,
> > > +                                     (errmsg("slot %s w/ catalog xmin %u conflicts with removed xid %u",
> > > +                                                     NameStr(slotname), slot_catalog_xmin, xid)));
> > > +             }
> > > +
> > > +
> > > +             if (found_conflict)
> > > +             {
> > > +                     elog(WARNING, "Dropping conflicting slot %s", s->data.name.data);
> > > +                     LWLockRelease(ReplicationSlotControlLock);      /* avoid deadlock */
> > > +                     ReplicationSlotDropPtr(s);
> > > +
> > > +                     /* We released the lock above; so re-scan the slots. */
> > > +                     goto restart;
> > > +             }
> > > +     }
> > >
> > I think this should be refactored so that the two found_conflict cases
> > set a 'reason' variable (perhaps an enum?) to the particular reason, and
> > then only one warning should be emitted.  I also think that LOG might be
> > more appropriate than WARNING - as confusing as that is, LOG is more
> > severe than WARNING (see docs about log_min_messages).
>
> What I have in mind is :
>
> ereport(LOG,
> (errcode(ERRCODE_INTERNAL_ERROR),
> errmsg("Dropping conflicting slot %s", s->data.name.data),
> errdetail("%s, removed xid %d.", conflict_str, xid)));
> where conflict_str is a dynamically generated string containing
> something like : "slot xmin : 1234, slot catalog_xmin: 5678"
> So for the user, the errdetail will look like :
> "slot xmin: 1234, catalog_xmin: 5678, removed xid : 9012"
> I think the user can figure out whether it was xmin or catalog_xmin or
> both that conflicted with removed xid.
> If we don't do this way, we may not be able to show in a single
> message if both xmin and catalog_xmin are conflicting at the same
> time.
>
> Does this message look good to you, or you had in mind something quite
> different ?

The above one is yet another point that needs to be concluded on. Till
then I will use the above way to display the error message in the
upcoming patch version.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



pgsql-hackers by date:

Previous
From: Amit Khandekar
Date:
Subject: Re: Minimal logical decoding on standbys
Next
From: Michael Meskes
Date:
Subject: Re: SQL statement PREPARE does not work in ECPG