"as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait) - Mailing list pgsql-hackers

From Alvaro Herrera
Subject "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)
Date
Msg-id 20201118214127.GA3179@alvherre.pgsql
In response to Re: remove spurious CREATE INDEX CONCURRENTLY wait  (Andres Freund <andres@anarazel.de>)
Responses Re: "as quickly as possible" (was: remove spurious CREATE INDEX CONCURRENTLY wait)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
> On 2020-11-17 12:55:01 -0300, Alvaro Herrera wrote:
> > ... ah, but I realize now that this means that we can use shared lock
> > here, not exclusive, which is already an enormous improvement.  That's
> > because ->pgxactoff can only be changed with exclusive lock held; so as
> > long as we hold shared, the array item cannot move.
> 
> Uh, wait a second. The acquisition of this lock hasn't been affected by
> the snapshot scalability changes, and therefore are unrelated to
> ->pgxactoff changing or not.

I'm writing a response trying to thoroughly analyze this, but I also
wanted to report that ProcSleep is being a bit generous with what it
calls "as quickly as possible" here:

            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

            /*
             * Only do it if the worker is not working to protect against Xid
             * wraparound.
             */
            statusFlags = ProcGlobal->statusFlags[autovac->pgxactoff];
            if ((statusFlags & PROC_IS_AUTOVACUUM) &&
                !(statusFlags & PROC_VACUUM_FOR_WRAPAROUND))
            {
                int            pid = autovac->pid;
                StringInfoData locktagbuf;
                StringInfoData logbuf;    /* errdetail for server log */

                initStringInfo(&locktagbuf);
                initStringInfo(&logbuf);
                DescribeLockTag(&locktagbuf, &lock->tag);
                appendStringInfo(&logbuf,
                                 _("Process %d waits for %s on %s."),
                                 MyProcPid,
                                 GetLockmodeName(lock->tag.locktag_lockmethodid,
                                                 lockmode),
                                 locktagbuf.data);

                /* release lock as quickly as possible */
                LWLockRelease(ProcArrayLock);

The amount of stuff that this is doing with ProcArrayLock held
exclusively seems a bit excessive; it sounds like we could copy the
values we need first, release the lock, and *then* do all that memory
allocation and string printing -- it's a lot of code for such a
contended lock.  Anytime a process sees itself as blocked by autovacuum
and wants to signal it, there's a ProcArrayLock hiccup (I didn't
actually measure it, but it's at least five function calls).  We could
make this more concurrent by copying lock->tag to a local variable,
releasing the lock, then doing all the string formatting and printing.
See attached quickly.patch.
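
To make the idea concrete, here is a minimal standalone sketch of the "copy, release, then format" pattern. It is not the contents of quickly.patch and it does not use PostgreSQL's API: a pthread mutex stands in for ProcArrayLock, and DemoLockTag, DemoLock, demo_status_flags, the DEMO_* flag bits and describe_blocker() are all hypothetical names invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef struct DemoLockTag
{
    unsigned    field1;
    unsigned    field2;
} DemoLockTag;

typedef struct DemoLock
{
    DemoLockTag tag;
} DemoLock;

#define DEMO_MAX_PROCS              8
#define DEMO_IS_AUTOVACUUM          0x01
#define DEMO_VACUUM_FOR_WRAPAROUND  0x02

/* Plays the role of the contended lock and the array it protects. */
static pthread_mutex_t demo_array_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned char demo_status_flags[DEMO_MAX_PROCS];

/*
 * Decide whether the blocking worker may be signalled and, if so, format a
 * log message into buf.  Returns the message length, or 0 if the worker is
 * not an autovacuum worker or is protecting against wraparound.
 */
int
describe_blocker(const DemoLock *lock, int pgxactoff, int waiter_pid,
                 char *buf, size_t buflen)
{
    DemoLockTag   tag_copy;
    unsigned char flags;

    assert(pgxactoff >= 0 && pgxactoff < DEMO_MAX_PROCS);

    /* Critical section: two plain copies, no allocation, no formatting. */
    pthread_mutex_lock(&demo_array_lock);
    flags = demo_status_flags[pgxactoff];
    tag_copy = lock->tag;
    pthread_mutex_unlock(&demo_array_lock);

    /* Everything below works on the local copies, with the lock released. */
    if (!(flags & DEMO_IS_AUTOVACUUM) || (flags & DEMO_VACUUM_FOR_WRAPAROUND))
        return 0;

    return snprintf(buf, buflen, "Process %d waits for lock %u/%u.",
                    waiter_pid, tag_copy.field1, tag_copy.field2);
}
```

The point of the sketch is only the ordering: the time the lock is held no longer depends on how expensive the message formatting is, at the cost of acting on a snapshot of the flags rather than on their current value.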

Now, when this code was written (d7318d43d, 2012), this was a LOG
message; it was demoted to DEBUG1 later (d8f15c95bec, 2015).  I think it
would be fair to ... remove the message?  Or go back to Simon's original
formulation from commit acac68b2bca, which had this message as DEBUG2
without any string formatting.

Thoughts?
