Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic - Mailing list pgsql-hackers
From | Justin Pryzby |
---|---|
Subject | Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic |
Date | |
Msg-id | 20210608110329.GB16435@telsasoft.com Whole thread Raw |
In response to | Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic |
List | pgsql-hackers |
On Sun, Jun 06, 2021 at 11:00:38AM -0700, Peter Geoghegan wrote: > On Sun, Jun 6, 2021 at 9:35 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > I'll leave the instance running for a little bit before restarting (or kill-9) > > in case someone requests more info. > > How about dumping the page image out, and sharing it with the list? > This procedure should work fine from gdb: > > https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Dumping_a_page_image_from_within_GDB > I suggest that you dump the "page" pointer inside lazy_scan_prune(). I > imagine that you have the instance already stuck in an infinite loop, > so what we'll probably see from the page image is the page after the > first prune and another no-progress prune. The cluster was again rejecting with "too many clients already". I was able to open a shell this time, but it immediately froze when I tried to tab complete "pg_stat_acti"... I was able to dump the page image, though - attached. I can send you its "data" privately, if desirable. I'll also try to step through this. postgres=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits,t_oid FROM heap_page_items(pg_read_binary_file('/tmp/dump_block.page')); lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid ----+--------+----------+--------+-----------+-----------+----------+--------+-------------+------------+--------+----------------------------------+------- 1 | 1320 | 1 | 259 | 926025112 | 0 | 0 | (1,1) | 32799 | 10499 | 32 | 11111111111111111111111000100000| 2 | 8088 | 1 | 104 | 926018702 | 0 | 0 | (1,2) | 32799 | 10497 | 32 | 11111111111111111111100000000000| 3 | 0 | 0 | 0 | | | | | | | | | 4 | 7904 | 1 | 179 | 926018702 | 0 | 0 | (1,4) | 32799 | 10499 | 32 | 11111111111111111111111000100000| 5 | 7728 | 1 | 176 | 926018702 | 0 | 0 | (1,5) | 32799 | 10499 | 32 | 11111111111111111111111000100000| 6 | 7464 | 1 | 259 | 926014884 | 926025112 | 0 | (1,1) | 49183 | 9475 | 32 | 11111111111111111111111000100000| 7 | 2 | 2 | 0 | | | | | | | | | 8 | 4 | 2 | 0 | | | | | | | | | 9 | 19 | 2 | 0 | | | | | | | | | 10 | 0 | 0 | 0 | | | | | | | | | 11 | 20 | 2 | 0 | | | | | | | | | 12 | 5792 | 1 | 1666 | 926014887 | 0 | 0 | (1,12) | 31 | 10499 | 32 | 11111111111111111111101000100000| 13 | 5 | 2 | 0 | | | | | | | | | 14 | 3912 | 1 | 1880 | 925994211 | 0 | 0 | (1,14) | 31 | 10499 | 32 | 11111111111111111111110100110000| 15 | 0 | 3 | 0 | | | | | | | | | 16 | 18 | 2 | 0 | | | | | | | | | 17 | 0 | 3 | 0 | | | | | | | | | 18 | 1936 | 1 | 1976 | 926013259 | 0 | 0 | (1,18) | 32799 | 10499 | 32 | 11111111111111111111110100110000| 19 | 1760 | 1 | 176 | 926014887 | 0 | 0 | (1,19) | 32799 | 10499 | 32 | 11111111111111111111111000100000| 20 | 1584 | 1 | 176 | 926014889 | 0 | 0 | (1,20) | 32799 | 10499 | 32 | 11111111111111111111111000100000| 21 | 6 | 2 | 0 | | | | | | | | | 22 | 0 | 3 | 0 | | | | | | | | | 23 | 0 | 3 | 0 | | | | | | | | | 24 | 0 | 3 | 0 | | | | | | | | | 25 | 0 | 3 | 0 | | | | | | | | | 26 | 0 | 3 | 0 | | | | | | | | | 27 | 0 | 3 | 0 | | | | | | | | | 28 | 0 | 3 | 0 | | | | | | | | | (28 rows) No great surprise that it's again in pg_statistic. #0 GetPrivateRefCountEntry (buffer=buffer@entry=411, do_move=do_move@entry=false) at bufmgr.c:313 #1 0x00000000007ecb4f in GetPrivateRefCount (buffer=411) at bufmgr.c:398 #2 BufferGetBlockNumber (buffer=buffer@entry=411) at bufmgr.c:2762 #3 0x00000000004fa0f3 in heap_prune_chain (prstate=0x7fff7e4a9180, rootoffnum=7, buffer=411) at pruneheap.c:625 #4 heap_page_prune (relation=relation@entry=0x7fe636faed28, buffer=buffer@entry=411, vistest=vistest@entry=0xe7bcc0 <GlobalVisCatalogRels>,old_snap_xmin=old_snap_xmin@entry=0, old_snap_ts=old_snap_ts@entry=0, report_stats=report_stats@entry=false, off_loc=<optimized out>, off_loc@entry=0x12b433c) at pruneheap.c:278 #5 0x00000000004fd9bf in lazy_scan_prune (vacrel=vacrel@entry=0x12b42d0, buf=buf@entry=411, blkno=blkno@entry=1, page=page@entry=0x2aaaab54be00"J\f", vistest=vistest@entry=0xe7bcc0 <GlobalVisCatalogRels>, prunestate=prunestate@entry=0x7fff7e4aae40) at vacuumlazy.c:1712 #6 0x0000000000500263 in lazy_scan_heap (aggressive=<optimized out>, params=0x12ce89c, vacrel=<optimized out>) at vacuumlazy.c:1347 #7 heap_vacuum_rel (rel=0x7fe636faed28, params=0x12ce89c, bstrategy=<optimized out>) at vacuumlazy.c:612 #8 0x000000000067418a in table_relation_vacuum (bstrategy=<optimized out>, params=0x12ce89c, rel=0x7fe636faed28) at ../../../src/include/access/tableam.h:1678 #9 vacuum_rel (relid=2619, relation=<optimized out>, params=params@entry=0x12ce89c) at vacuum.c:2001 -- Justin
Attachment
pgsql-hackers by date: