Re: Why does VACUUM FULL pg_class sometimes wait for ShareLock on another transaction after getting AccessExclusiveLock on pg_class? - Mailing list pgsql-general

From Marcin Barczynski
Subject Re: Why does VACUUM FULL pg_class sometimes wait for ShareLock on another transaction after getting AccessExclusiveLock on pg_class?
Date
Msg-id CAOhG4wcheK1osPwTM=u2tYLCxR9HViZq95Q_p8kExuiEGy1mrA@mail.gmail.com
Whole thread Raw
In response to Why does VACUUM FULL pg_class sometimes wait for ShareLock on another transaction after getting AccessExclusiveLock on pg_class?  (Marcin Barczynski <mbarczynski@starfishstorage.com>)
Responses Re: Why does VACUUM FULL pg_class sometimes wait for ShareLock on another transaction after getting AccessExclusiveLock on pg_class?  (Marcin Barczynski <mbarczynski@starfishstorage.com>)
List pgsql-general
On Thu, Jul 22, 2021 at 3:51 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Thu, 2021-07-22 at 12:55 +0200, Marcin Barczynski wrote:
> > There was a long-running transaction consisting of two queries:
> >
> >     CREATE TEMPORARY TABLE abc AS SELECT * FROM def_view;
> >     INSERT INTO xyz_table SELECT * FROM abc;
> >    
> > When I ran VACUUM FULL pg_class, it waited for ShareLock on that transaction:
>
> There must have been something else using "pg_class", since the above
> won't take any permament locks on "pg_class", nor should it block VACUUM.


Thanks for your reply. I dugged a bit deeper, and it turned out that VACUUM FULL hung in heapam_index_build_range_scan.
It's PostgreSQL 13.3. Comments around heapam_handler.c:1482:

/*
* Since caller should hold ShareLock or better, normally
* the only way to see this is if it was inserted earlier
* in our own transaction.  However, it can happen in
* system catalogs, since we tend to release write lock
* before commit there.  Give a warning if neither case
* applies.
*/

/*
 * If we are performing uniqueness checks, indexing
 * such a tuple could lead to a bogus uniqueness
 * failure.  In that case we wait for the inserting
 * transaction to finish and check again.
 */

Not sure how to interpret those comments, as:
- on the one hand, the long running transaction created a temporary table, so it had to update pg_class,
- but on the other hand pg_locks showed no sign of lock on any of 3 indexes of pg_class, and locks once obtained are kept till the end of transaction.

Full stack trace below:

#0  0x00007f3537b52e43 in ?? () from /lib64/libc.so.6
#1  0x000000000076d0ae in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffd7bae7620, cur_timeout=-1, set=0x26c2570) at latch.c:1295
#2  WaitEventSetWait (set=set@entry=0x26c2570, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffd7bae7620, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=50331653) at latch.c:1247
#3  0x000000000076d52d in WaitLatchOrSocket (latch=0x2aacc9ed95e4, wakeEvents=wakeEvents@entry=33, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=50331653) at latch.c:428
#4  0x000000000076d600 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=33, timeout=timeout@entry=0, wait_event_info=<optimized out>) at latch.c:368
#5  0x000000000078625f in ProcSleep (locallock=locallock@entry=0x25c4c90, lockMethodTable=lockMethodTable@entry=0x9fa160 <default_lockmethod>) at proc.c:1292
#6  0x000000000077a4a2 in WaitOnLock (locallock=locallock@entry=0x25c4c90, owner=owner@entry=0x261cf58) at lock.c:1859
#7  0x000000000077b7cf in LockAcquireExtended (locktag=locktag@entry=0x7ffd7bae79a0, lockmode=lockmode@entry=5, sessionLock=sessionLock@entry=false, dontWait=dontWait@entry=false, reportMemoryError=reportMemoryError@entry=true, locallockp=locallockp@entry=0x0) at lock.c:1101
#8  0x000000000077c024 in LockAcquire (locktag=locktag@entry=0x7ffd7bae79a0, lockmode=lockmode@entry=5, sessionLock=sessionLock@entry=false, dontWait=dontWait@entry=false) at lock.c:752
#9  0x0000000000779758 in XactLockTableWait (xid=xid@entry=113551215, rel=rel@entry=0x7f353a2bcdd8, ctid=ctid@entry=0x26c2194, oper=oper@entry=XLTW_InsertIndexUnique) at lmgr.c:673
#10 0x00000000004ce843 in heapam_index_build_range_scan (heapRelation=0x7f353a2bcdd8, indexRelation=0x7f353a387678, indexInfo=0x26c2030, allow_sync=<optimized out>, anyvisible=false, progress=true, start_blockno=0, numblocks=4294967295, callback=0x4e80e0 <_bt_build_callback>,
    callback_state=0x7ffd7bae7e70, scan=0x26c2140) at heapam_handler.c:1482
#11 0x00000000004e828a in table_index_build_scan (scan=<optimized out>, callback_state=0x7ffd7bae7e70, callback=0x4e80e0 <_bt_build_callback>, progress=true, allow_sync=true, index_info=0x26c2030, index_rel=<optimized out>, table_rel=<optimized out>)
    at ../../../../src/include/access/tableam.h:1536
#12 _bt_parallel_scan_and_sort (btspool=btspool@entry=0x26c1fe0, btspool2=0x0, btshared=0x7f353a3fa300, sharedsort=0x7f353a3fa2a0, sharedsort2=0x0, sortmem=<optimized out>, progress=progress@entry=true) at nbtsort.c:1970
#13 0x00000000004e9017 in _bt_leader_participate_as_worker (buildstate=<optimized out>, buildstate=<optimized out>) at nbtsort.c:1778
#14 _bt_begin_parallel (request=<optimized out>, isconcurrent=<optimized out>, buildstate=0x7ffd7bae7fb0) at nbtsort.c:1650
#15 _bt_spools_heapscan (indexInfo=0x26bf158, buildstate=0x7ffd7bae7fb0, index=..., heap=0x7f353a2bcdd8) at nbtsort.c:394
#16 btbuild (heap=0x7f353a2bcdd8, index=0x7f353a387678, indexInfo=0x26bf158) at nbtsort.c:326
#17 0x000000000053f3e6 in index_build (heapRelation=heapRelation@entry=0x7f353a2bcdd8, indexRelation=indexRelation@entry=0x7f353a387678, indexInfo=indexInfo@entry=0x26bf158, isreindex=isreindex@entry=true, parallel=parallel@entry=true) at index.c:2962
#18 0x0000000000541c73 in reindex_index (indexId=indexId@entry=2662, skip_constraint_checks=<optimized out>, persistence=persistence@entry=112 'p', options=options@entry=0) at index.c:3598
#19 0x0000000000542127 in reindex_relation (relid=relid@entry=1259, flags=flags@entry=18, options=options@entry=0) at index.c:3821
#20 0x00000000005ae37f in finish_heap_swap (OIDOldHeap=OIDOldHeap@entry=1259, OIDNewHeap=OIDNewHeap@entry=1325070585, is_system_catalog=is_system_catalog@entry=true, swap_toast_by_content=swap_toast_by_content@entry=false, check_constraints=check_constraints@entry=false,
    is_internal=is_internal@entry=true, frozenXid=frozenXid@entry=113424494, cutoffMulti=cutoffMulti@entry=10214958, newrelpersistence=newrelpersistence@entry=112 'p') at cluster.c:1393
#21 0x00000000005aeb1d in rebuild_relation (verbose=<optimized out>, indexOid=0, OldHeap=<optimized out>) at cluster.c:587
#22 cluster_rel (tableOid=tableOid@entry=1259, indexOid=indexOid@entry=0, options=<optimized out>) at cluster.c:403
#23 0x000000000061d60b in vacuum_rel (relid=1259, relation=<optimized out>, params=params@entry=0x7ffd7bae87c0) at vacuum.c:1890


--
Regards,
Marcin Barczynski

pgsql-general by date:

Previous
From: Ninad Shah
Date:
Subject: Re: Doubt on pgbouncer
Next
From: Mark Lybarger
Date:
Subject: version 0 calling convention